Abstract

Influenza has been circulating in the human population and has caused three pandemics in the last
century (1918 H1N1, 1957 H2N2, 1968 H3N2). The 2009 A(H1N1) was classified by the World Health
Organization (WHO) as the fourth pandemic. Influenza has a high evolution rate, which makes vaccine
design challenging. We here consider an approach for early detection of new dominant strains. By
clustering the 2009 A(H1N1) sequence data, we found two main clusters. We then define a metric to
detect the emergence of dominant strains. We show on historical H3N2 data that this method is able
to identify a cluster around an incipient dominant strain before it becomes dominant. For example, for
H3N2 as of March 30, 2009, the method detects the cluster for the new A/British Columbia/RV1222/2009
strain. This strain detection tool would appear to be useful for annual influenza vaccine selection.
Keywords: clustering/H1N1/H3N2/influenza

Introduction 

The recent outbreak of 2009 A(H1N1) drew immediate international attention [1-4]. This new 2009
A(H1N1) virus contains a combination of gene segments from swine and human influenza viruses [2,3].
Confirmed infections reached 270,000 globally as of September 2009 [5]. The novel 2009 A(H1N1) strain
was declared a pandemic strain by the World Health Organization (WHO) in 2009 [6], and was the
epidemic strain in the 2009 Northern winter.

Influenza viruses are hypermutating viruses. It has been estimated that the nucleotide mutation
rate per genome per replication is approximately 0.76 [7]. Influenza viruses escape the human immune
system by continual antigenic drift and shift [8-13]. The quasispecies nature of influenza viruses makes 
the strain structure complex [14]. Usually, there is one or a few dominant influenza strains circulating 
in the population for each flu season. The flu vaccine is most effective when it matches this dominant 
circulating strain [15, 16]. The degree to which immunity induced by a vaccine protects against a different 
viral strain is determined by the antigenic distance between the vaccine and the virus. Due to evolution 
of the antigenic regions of the influenza virus, the composition of the flu vaccine is typically modified 
annually [17]. However, since the influenza strains used in the flu vaccine are decided 6 months before 
the flu season, a mismatch between the vaccine strain and dominant circulating strain may occur if the 
virus evolves significantly. Such a situation arose for the H3N2 virus in the 2009-2010 flu season, when 
A/British Columbia/RV1222/2009 emerged in the early spring [18,19]. Accurate early prediction of the
dominant circulating strain is therefore an essential task in influenza research.

Understanding the evolution of influenza viruses has benefited from phylogenetic reconstructions of 
the hemagglutinin protein evolution [11,20]. In an alternative approach, Lapedes and Farber [21], followed 
by Smith et al. [22], applied a technique called multidimensional scaling to study antigenic evolution of
influenza. Plotkin et al. clustered hemagglutinin protein sequences using the single-linkage clustering 
algorithm and found that influenza viruses group into clusters [23]. 

Here, we present a low-dimensional clustering method that can detect the cluster containing an 
incipient dominant strain for an upcoming flu season before the strain becomes dominant. The method 
builds upon the dimensional projection technique used by Lapedes and Farber [21] and Smith et al. [22]
to characterize hemagglutination inhibition data. In this paper, we first study the evolution of 2009
A(H1N1) by an evolutionary path map, which leads to a suggestion for the H1N1 vaccine strain. Then,
we introduce the low-dimensional protein sequence clustering method. We propose an influenza vaccine 
selection procedure based on this sequence clustering. The procedure is demonstrated and tested in detail 
using historical data. We show the performance of the method to predict the dominant H3N2 strain in 
an upcoming flu season using data solely from before the flu season, on data since 1996. We compare the 
results to those from existing methods since 1996. In the discussion section, we discuss the relationship 
between the protein sequence clustering method and previous approaches. We discuss the false positive 
rate, as well as other challenges. 

Results 

Evolutionary path of 2009 A(H1N1) influenza

We first construct the directional evolutionary path for the 2009 A(H1N1) influenza. We use high-resolution
data in sequence, time, and world spatial coordinates to construct this evolutionary relationship.
Since its first detection, the 2009 A(H1N1) virus has been extensively sequenced [2,3]. By May 1, 2009,
the number of confirmed cases reported by WHO was 333 [5]. At the same time, 312 sequenced hemagglutinin
(HA) proteins were available in the NCBI Influenza Resources Database [24]; that is to say, most
of the confirmed cases at that time were sequenced. By July 1, 2009, the ratio of sequenced HA proteins to
cases confirmed by WHO was 1039/77201 [5], a ratio which is still much larger than that for seasonal
flu. In addition, the Influenza Resources Database contains the date of collection of each 2009 A(H1N1)
virus strain. We reconstruct the evolutionary history of swine flu viruses with the following procedure. If
strain B is mutated from strain A, we term strain A "founder" and strain B "F1". We align the HA proteins
of all 2009 A(H1N1) strains. Then, for each strain, we find its founder strain based on the following
four criteria: 1, the founder strain should appear earlier than the strain, as judged by collection date; 2,
the founder strain should have only one amino acid difference in the HA1 protein relative to the F1 strain;
3, the founder should also have the most similar nucleotide sequence relative to F1; and 4, the founder
strain should have a large number of identical copies circulating in the human population, as approximated
by the number of different strains with identical HA sequences in the Influenza Resources Database. By
applying these four criteria to 2009 A(H1N1) influenza, we construct the directional evolutionary path
map, as shown in Fig. 1. We can see two clusters: one around A/New York/19/2009 (#28), and another
one around A/Texas/05/2009 (#12). Most new strains are from the Northern hemisphere, and strains
from the Southern hemisphere are mainly located at the edge of the map, such as strains #96, #120, and
#126. Geographically, we see many founder-to-F1 links from the US and Mexico to other countries, but
we rarely see founder-to-F1 links from other countries to the US and Mexico, or between countries
other than the US and Mexico (see Materials and Methods). We also found that strains with
more F1 strains in Fig. 1 are more frequently seen in the human population. For example, in the Influenza
Resources Database, we found 153 strains to be identical with A/New York/19/2009, which has 29 F1
strains, and 120 strains to be identical with A/Texas/05/2009, which has 24 F1 strains. We can see in
Fig. 1 that A/Texas/05/2009 is at the very upstream of the map, with downward connections to most
of the other strains by direct or two-step links. This result agrees with the US Food and Drug Administration
[25] recommendation of A/Texas/05/2009 as a vaccination strain. The alternative vaccine strain
A/California/7/2009 (#7) has fewer F1 strains and is not located at the center of the network.
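
The founder search described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used to produce Fig. 1: the Strain record, its field names, and the function find_founder are hypothetical, and we assume that criteria 3 and 4 act as successive tie-breakers among the candidates that pass criteria 1 and 2, an ordering that the text does not specify.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass(frozen=True)
class Strain:
    """Hypothetical record for one isolate; all sequences are pre-aligned."""
    name: str
    collected: date   # collection date
    protein: str      # aligned HA protein sequence
    nucleotide: str   # aligned HA nucleotide sequence
    copies: int       # database isolates sharing this exact HA sequence


def hamming(a: str, b: str) -> int:
    """Number of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def find_founder(strain: Strain, candidates: Iterable[Strain]) -> Optional[Strain]:
    """Return the most plausible founder of `strain`, or None if there is none."""
    eligible = [
        c for c in candidates
        if c.collected < strain.collected                 # criterion 1: earlier
        and hamming(c.protein, strain.protein) == 1       # criterion 2: one residue
    ]
    if not eligible:
        return None
    # Criterion 3 (closest nucleotide sequence), then criterion 4 (most copies).
    # Using them as ordered tie-breakers is an assumption of this sketch.
    return min(
        eligible,
        key=lambda c: (hamming(c.nucleotide, strain.nucleotide), -c.copies),
    )
```

Applying find_founder to every strain and drawing an arrow from each returned founder to its F1 strain would give a directed map of the kind shown in Fig. 1.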

Low-dimensional clustering 

We use a low-dimensional clustering method to visualize the antigenic distance matrix of the viruses.
We use a statistical tool called "multidimensional scaling" [26]. This method was used by Lapedes
and Farber [21] and Smith et al. [22] to project ferret hemagglutination inhibition assay data to low
dimensions. The influenza viral surface glycoprotein hemagglutinin is a primary target of the protective
immune response. Here we project the hemagglutinin protein sequence data, rather than animal model
data, to low dimensions. The HA1 protein of influenza, with 329 residues, can be considered as a point in a
329-dimensional space. The multidimensional scaling method is applied to rescale the 329-dimensional space to
a 2-dimensional space, so that we can plot and visualize it. First, we do a multialignment of the HA1
proteins. Then, the distance between any two proteins is calculated as

    $d_{ij} = \frac{1}{N} \sum_{m=1}^{N} \left( 1 - \delta_{s_{i,m}, s_{j,m}} \right)$        (1)

where $s_{i,m}$ is the amino acid of protein i at position m. The term $\delta_{s_{i,m}, s_{j,m}}$ is 1 if the amino acids of proteins
i and j at position m are the same. Otherwise, it is 0. For the 2009 H1N1 viruses, we consider the entire
HA protein, and N = 566. For H3N2 viruses, we consider only the HA1 protein, and N = 329, because
the entire HA proteins are not completely sequenced in many cases. Thus, $d_{ij}$ is the number of amino
acid differences between HA proteins normalized by length. The multidimensional scaling produces a
protein distance map, for example, Fig. 2(b). In this map, each data point represents a flu strain isolate.
The Euclidean distance between two points in the map approximates the protein distance in Equation
1 between these two flu strains (see Materials and Methods for details of this distance approximation
procedure). Two closely located points imply two strains with similar HA protein sequences.
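
As an illustration of the distance and projection steps, the following Python sketch computes the length-normalized number of amino acid differences between aligned sequences and embeds the resulting matrix in two dimensions. It assumes classical (Torgerson) multidimensional scaling, which may differ from the procedure detailed in Materials and Methods; the function names distance_matrix and classical_mds are ours.

```python
import numpy as np


def distance_matrix(seqs):
    """Fraction of aligned positions at which each pair of sequences differs."""
    # Encode residues as integer codes so the comparison is purely numeric.
    codes = np.array([[ord(ch) for ch in s] for s in seqs])   # shape (n, N)
    differs = codes[:, None, :] != codes[None, :, :]           # shape (n, n, N)
    return differs.mean(axis=2)


def classical_mds(d, dims=2):
    """Embed a distance matrix in `dims` dimensions by classical MDS."""
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering   # double-centered squared distances
    vals, vecs = np.linalg.eigh(gram)                # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:dims]              # indices of the largest eigenvalues
    # Sequence distances need not be exactly Euclidean, so negative
    # eigenvalues are clipped to zero before taking the square root.
    scale = np.sqrt(np.clip(vals[top], 0.0, None))
    return vecs[:, top] * scale


if __name__ == "__main__":
    toy = ["AAAA", "AAAT", "TTTT"]       # toy aligned sequences, not real HA data
    d = distance_matrix(toy)
    print(d)
    print(classical_mds(d))
```

Each row of the returned array is the (x, y) position of one isolate in the protein distance map. Because the map has only two dimensions, its Euclidean distances in general only approximate the input distances.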

We apply the low-dimensional clustering method to study 2009 A(H1N1). We plot the protein distance
map in Fig. 2(b). Both A/Texas/05/2009 and A/New York/19/2009 are located near the center of the
cluster, in good agreement with the observation from Fig. 1 that they are the founder strains for many F1
strains. To detect the clusters in the protein distance map, we use a statistical method known as kernel
density estimation [26]. Kernel density estimation is a non-parametric method to estimate the probability
density function from which data come. The kernel density figure is produced from the protein distance
map, and it shows the density of influenza strains in sequence space. We plot the kernel density as a
three-dimensional shaded surface. For example, the kernel density surface in Fig. 2(a) is produced from Fig.
2(b). The x and y axes in Fig. 2(a) are the same as those in Fig. 2(b) and are protein distance coordinates.
The z dimension measures the density of flu strains around point (x, y). We use the surface height and
the colors to represent z values, and the color is proportional to surface height. A peak in the kernel density
in Fig. 2(a) indicates a cluster of related flu strains in the protein distance map in Fig. 2(b).
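
The cluster detection step can be sketched as follows. This is an illustration only: the kernel and bandwidth are not stated in this section, so the sketch assumes an isotropic Gaussian kernel with a caller-supplied bandwidth, and it reports grid points that are local maxima above a caller-supplied density cutoff. The names kernel_density and find_peaks are ours.

```python
import numpy as np


def kernel_density(points, grid_x, grid_y, bandwidth):
    """Gaussian kernel density of 2-D points, evaluated on a rectangular grid.

    points: array of shape (n, 2); returns an array of shape (len(grid_y), len(grid_x)).
    """
    gx, gy = np.meshgrid(grid_x, grid_y)
    grid = np.stack([gx.ravel(), gy.ravel()], axis=1)                    # (G, 2)
    sq_dist = ((grid[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)  # (G, n)
    density = np.exp(-sq_dist / (2.0 * bandwidth ** 2)).sum(axis=1)
    density /= len(points) * 2.0 * np.pi * bandwidth ** 2                # normalization
    return density.reshape(gx.shape)


def find_peaks(density, cutoff):
    """Return (row, col) grid indices that are local maxima above `cutoff`."""
    peaks = []
    n_rows, n_cols = density.shape
    for r in range(n_rows):
        for c in range(n_cols):
            value = density[r, c]
            if value < cutoff:
                continue
            # Compare against the 3x3 neighbourhood, truncated at the grid edges.
            window = density[max(r - 1, 0):r + 2, max(c - 1, 0):c + 2]
            if value >= window.max():
                peaks.append((r, c))
    return peaks
```

Each returned peak marks one candidate cluster; the row index refers to the y grid and the column index to the x grid. The number of peaks depends on the bandwidth and the cutoff, so both should be reported alongside any cluster count.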

There are two significant clusters in Fig. 2(a), as two peaks are observed. The cluster on the left
side contains A/Texas/05/2009. Another cluster on the right side contains A/New York/19/2009. The
2009 A(H1N1) virus has evolved slowly to date. The greatest p_epitope antigenic distance (definition in the
Materials and Methods) between A/Texas/05/2009 and all sequenced strains is measured to be < 0.08.
Values of p_epitope less than 0.45 for H1N1 indicate positive expected vaccine efficacy [27], and so a vaccine
is expected to be efficacious. All of the amino acids in all five epitopes of a strain of A/Texas/05/2009 and
a strain of A/New York/19/2009 are the same. Multidimensional scaling predicts that A/Texas/05/2009
will be the dominant strain in the 2009-2010 season, and that A/Texas/05/2009 is a suitable strain for
vaccination. Our focus is on the expected vaccine effectiveness, as it can be judged from antisera HI 
assay or sequence data alone. We do not consider other aspects such as growth in hen's eggs or other 
manufacturing constraints. Laboratory growth and passage data are needed to address these aspects. 

H3N2 virus evolution for 40 years 

We construct the protein distance map to determine the evolution of influenza A(H3N2) virus from 1969 to
2007. Sequences of HA1 proteins were downloaded from the Influenza Virus Resources database [24]. We
use the multidimensional clustering method [21] to generate the protein distance map and corresponding
kernel density estimation in Fig. 3. Smith et al. [22] produced a similar graph using ferret antisera HI
assay data. The figure presented here has a higher resolution, and more clusters are observed, because
protein sequence data are more abundant and accurate than antisera HI assay data. The evolution of
influenza tends to group strains into clusters. In Fig. 3, we identified 14 major clusters by setting a cutoff
value of kernel density. We marked each cluster by the first vaccine strain in the cluster. The average
duration time for a cluster is 2.7 years, which is also the approximate duration of a vaccine. There are
apparent gaps between clusters. The antigenic distance between two strains in two separate clusters is
larger than the distances within the same cluster. The influenza virus evolves within one cluster before
jumping from one cluster to another cluster. These dynamics occur because small antigenic drift by one
or a few sequential mutations does not lead the virus to completely escape from cross immunity induced
by vaccine protection or prior exposure.

For vaccine design, when the viruses evolve as a quasispecies in the same cluster, the vaccine that 
is targeted to the cluster provides protection. This protection decreases with antigenic distance. When 
the viruses jump to a new cluster by antigenic drift or shift, one would want to update the vaccine to 
provide protection against strains in the new cluster. In Fig. 3(a), the arrows point to the exact position 
of vaccine strains. It can be seen that the positions of vaccine strains are near the center of clusters. It 
can be shown mathematically that choosing the consensus strain of a cluster as vaccine strain minimizes 
the p_epitope antigenic distance between vaccine strain and cluster strains, and thus maximizes expected
vaccine efficacy. 
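
For concreteness, the consensus strain of a cluster, defined later in this paper as the sequence of the most abundant residue at each position of the multialignment, can be computed as in the sketch below. The function name consensus is ours, and the tie-breaking rule (the residue seen first among equally frequent ones) is an assumption, since the text does not say how ties are resolved.

```python
from collections import Counter


def consensus(aligned):
    """Most abundant residue at each position of equal-length aligned sequences."""
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    # zip(*aligned) walks the alignment column by column. Counter.most_common
    # keeps first-seen order among ties, which is the assumed tie-break here.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*aligned))


if __name__ == "__main__":
    cluster = ["ACDE", "ACDF", "GCDF"]   # toy sequences, not real HA1 data
    print(consensus(cluster))
```

The consensus need not coincide with any isolate actually observed in the cluster.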

Influenza vaccine strain selection 

We now use the low-dimensional sequence clustering method in an effort to detect a new flu strain before
it becomes dominant. A question of interest in influenza research is whether we can predict which
strain will be dominant in the next flu season based on the information we have at present. The WHO
convenes every February to make a recommendation for influenza strains to be used in the vaccine
for the next flu season in the Northern hemisphere. The vaccine is expected to have high efficacy if the
chosen strain is dominant in the next flu season. The recommendation is especially challenging to make
when the dominant strain in the next flu season has not been dominant before February of that year. For
example, in mid-March 2009, a new H3N2 strain appeared [18,19], which infected a significant fraction
of the population in the Southern hemisphere.

The currently accepted influenza vaccine strain selection procedure is as follows [17]. Isolate samples are
collected by the WHO GISN and are characterized antigenically using the hemagglutination inhibition (HI)
assay. About 10% of samples are also sequenced in the HA1 domain of the HA gene. Antigenic maps are
constructed from the HI assay data using a dimensional projection technique. Examination of HI data is
not dependent on analysis using dimensional projection, but rather, the primary HI data may carry the
most weight. If the vaccine does not match the current circulating strains, the vaccine is updated to
contain one representative of the circulating strains. The emerging variant strains are identified. If the
antigenically distinct emerging variants are judged to be the dominant strains in the upcoming season,
the vaccine is updated to include one representative of the emerging variants. The key issue and major
difficulty is how to judge whether emerging variants will be the dominant variants in the next season. If a
fourfold difference in antisera HI titer between the vaccine strain and the emerging strains is observed,
the emerging strains are judged to be the dominant strains in the upcoming season, and an updated
vaccine is recommended to include the emerging strains [17].

Here, we propose a modified vaccine selection process based on clustering detection. First, we apply
multidimensional scaling to make a protein distance map from HA1 sequences, instead of constructing
an antigenic map from HI assay data. Then, we use kernel density estimation to determine the
clusters of strains. If the vaccine does not match the current circulating cluster, the vaccine is updated
to contain the current circulating strain. If the vaccine matches the current circulating cluster,
but an emerging cluster is judged likely to be the major cluster in the upcoming season, the vaccine
is updated to contain the consensus strain of the emerging cluster. We judge whether a cluster is an
emerging dominant cluster by two criteria. The first criterion is that this cluster can be detected by
kernel density estimation, and is separate from the cluster that contains the current circulating strain
or vaccine strain. A cluster that can be detected by kernel density estimation usually contains a central
strain that has multiple identical copies and some F1 strains that are closely related to the central
strain. An example is the cluster of A/Texas/05/2009(H1N1) in Fig. 1. A/Texas/05/2009(H1N1) is
the central strain, which has 120 strains with identical HA protein sequences in the Influenza Virus Resource
database [24]. A/Texas/05/2009(H1N1) also has 29 F1 strains that differ by one amino acid. So,
A/Texas/05/2009(H1N1) and the surrounding strains form a cluster, as we detected in Fig. 2 by kernel
density estimation.

The second criterion is that the current vaccine strain does not match the consensus strain of the
cluster and is estimated to provide low protection against strains in the cluster. That is, an immune
response stimulated by a vaccine cannot effectively protect against infection by sufficiently distant
new strains. The consensus strain is a protein sequence that shows which residues are most abundant in
the multialignment at each position. The efficacy of the current vaccine against the new cluster can be estimated
from ferret antisera HI assay data. However, the antisera data have low resolution and an imperfect
correlation to vaccine effectiveness in humans [16,28]. Instead, we use p_epitope, which is calculated as
the fraction of mutations in the dominant epitope, to estimate vaccine efficacy; p_epitope has a more robust
correlation to vaccine effectiveness in humans than do ferret HI data [16]. When the p_epitope between the
current vaccine strain and the consensus strain of the new cluster is larger than 0.19, expected vaccine efficacy
decreases to zero for H3N2 influenza, and the current vaccine cannot be expected to provide protection from
new strains. As the examples below show, our method can detect an incipient dominant strain at a
very early stage, and the method appears to require about 10 sequences in the new cluster for detection.
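
The second criterion can be illustrated with a short Python sketch. It takes p_epitope to be the largest per-epitope fraction of mutated sites between the vaccine strain and a circulating strain, that is, the fraction of mutations in the dominant epitope. The epitope site lists are inputs: the positions used in the example are placeholders rather than real H3N2 epitope maps, and the epitope sizes (21 sites for B, 41 for D) are an assumption of the example. The decision threshold is also left to the caller, because the text gives 0.19 as the point of zero expected efficacy but treats clusters with smaller average p_epitope as fulfilling the criterion. All function names are ours.

```python
def epitope_fractions(vaccine, strain, epitopes):
    """Fraction of mutated sites in each epitope.

    epitopes maps an epitope name to 0-based positions in the alignment.
    """
    return {
        name: sum(vaccine[i] != strain[i] for i in sites) / len(sites)
        for name, sites in epitopes.items()
    }


def p_epitope(vaccine, strain, epitopes):
    """Fraction of mutated sites in the dominant (most mutated) epitope."""
    return max(epitope_fractions(vaccine, strain, epitopes).values())


def mean_p_epitope(vaccine, cluster, epitopes):
    """Average p_epitope of the strains in a cluster relative to the vaccine."""
    return sum(p_epitope(vaccine, s, epitopes) for s in cluster) / len(cluster)


def is_incipient_dominant(separate_peak, mean_p, threshold):
    """Both criteria: a separate density peak and low expected protection.

    `threshold` is a caller-chosen value, not one fixed by the text.
    """
    return separate_peak and mean_p > threshold


if __name__ == "__main__":
    # Placeholder layout: epitope B on sites 0-20, epitope D on sites 21-61.
    epitopes = {"B": list(range(0, 21)), "D": list(range(21, 62))}
    vaccine = "A" * 62
    variant = list(vaccine)
    for pos in (0, 1, 21):           # two changes in B, one in D
        variant[pos] = "T"
    print(round(p_epitope(vaccine, "".join(variant), epitopes), 3))
```

With two substitutions in a 21-site epitope and one in a 41-site epitope, the dominant epitope is B and p_epitope = 2/21, or about 0.095.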

Demonstration of low-dimensional sequence clustering method. 

We demonstrate the method by detecting the A/Fujian/411/2002(H3N2) strain. A/Panama/2007/1999
had been the vaccination strain for four flu seasons between 2000 and 2004 in the Northern hemisphere.
The vaccine strain was replaced by A/Fujian/411/2002(H3N2) in the 2004-2005 flu season, as described
in Table 1. The vaccine strain in the 2003-2004 season was A/Panama/2007/1999, while the dominant
circulating strain became A/Fujian/411/2002(H3N2). This mismatch resulted in a large decrease in vaccine
efficacy in the 2003-2004 flu season [16]. The vaccine efficacy is estimated to be only 12% [29]. We
test whether our method can detect A/Fujian/411/2002(H3N2) as an incipient dominant strain before
it actually became dominant. We use only virus sequence data before October 1, 2003. We did not use
any virus data collected in the 2003-2004 season. Therefore, our prediction and results are made without any
knowledge of what happened in the 2003-2004 season. We plot the protein distance map of the 2001-2002
flu season in Fig. 4(d). To detect the clusters, we plot the kernel density in Fig. 4(b) for the data
in Fig. 4(d). There are two separate significant clusters. The one with the largest kernel density on the
left contains the current dominant strain A/Panama/2007/1999 and the widespread A/Moscow/10/1999
strain. The smaller one on the right is a new cluster, which contains A/Fujian/411/2002. Using the data
as of September 30, 2002, we seek to determine whether the new cluster on the right in Figs. 4(b) and
(d) will be the next dominant strain after A/Panama/2007/1999. We determine whether this cluster
fulfills the two criteria above. First, this new cluster can be significantly detected by kernel density
estimation. This cluster is separate from the current dominant strain, as we can see in the figure. Second,
we calculated the average p_epitope of the new cluster on the right with regard to A/Moscow/10/1999,
A/Panama/2007/1999 and A/Fujian/411/2002 to be 0.214, 0.1214, and 0.083, respectively. This means
the current vaccine, which contains A/Moscow/10/1999, is expected to provide little protection against viruses
in the new cluster. This result makes the new cluster fulfill the second criterion. Thus, we predict based
on the data as of September 30, 2002, that the cluster on the right in Fig. 4(d) will be the next dominant
cluster. This prediction was made on data collected one year earlier than when A/Fujian/411/2002
became dominant in the 2003-2004 season. To further support our prediction, in Fig. 4(c), we plot the
protein distance map from October 1, 2002, to February 1, 2003, right before the WHO selected the
vaccine strain for the 2003-2004 season. To detect the clusters, we plot the kernel density in Fig. 4(a) for the data in
Fig. 4(c). There are two separate major clusters observed in the kernel density estimation in Fig. 4(a).
The left cluster has the current dominant strain of A/Panama/2007/1999 and also A/Moscow/10/1999.
The right cluster has A/Fujian/411/2002. We calculated the average p_epitope of the new cluster on the right
with regard to A/Moscow/10/1999, A/Panama/2007/1999, and A/Fujian/411/2002 to be 0.2725,
0.1811, and 0.0367, respectively. This result further supports the prediction that the new cluster will
become dominant, and A/Fujian/411/2002, which is the most frequent strain in the new cluster, will be
or is very close to the next dominant strain. This suggestion precedes the vaccine component switch by
1-2 years, as shown in Table 1.

Prediction for H3N2 influenza in 2009-2010. 

By applying our method to the 2008-2009 flu season, we predict that the dominant H3N2 strain in
the 2009-2010 flu season may switch. Based on the flu activity in the 2008-2009 flu season, the WHO
made the recommendation in February 2009 that A/Brisbane/10/2007(H3N2) should be used as the
vaccine [30]. However, a new strain evolved just after the recommendation was published. The British
Columbia Center for Disease Control detected a new virus strain [18,19] with 3 mutations in antigenic
sites (two in epitope B and one in epitope D). Since this new strain is relatively far from the vaccine
strain, with p_epitope = 0.095, vaccine efficacy is expected to decrease to 20% [1,16]. However, since the
mutations in this new strain "do not fulfill the criteria proposed by Cox as corresponding to meaningful
antigenic drift" [18,31], and this strain still remained the minority of H3N2 viruses in July 2009, health
authorities were not certain that this new strain would replace the current dominant strain in the 2009-2010
flu season. We use our method to investigate whether this new strain will be the next dominant strain.
We construct the protein distance map as shown in Fig. 5(c). We plot the kernel density estimation
in Fig. 5(a) for the data in Fig. 5(c). Using the data up to June 14, 2009, we see two major clusters in Fig.
5(a). The larger one on the right contains the current dominant strain A/Brisbane/10/2007, and the
left one is a new cluster which contains A/British Columbia/RV1222/2009. It is apparent that this new
cluster is separate from the current dominant cluster. Thus, this cluster fulfills the first criterion. We
calculated the average p_epitope of strains in the new cluster on the left with regard to A/Brisbane/10/2007
and A/British Columbia/RV1222/2009 to be 0.103 and 0.042, respectively. The vaccine that contains
A/Brisbane/10/2007 has an expected efficacy of 20% against the virus strains in the new cluster. Thus,
this new cluster satisfies both criteria, and so we predict that this cluster, which contains A/British
Columbia/RV1222/2009, will be the dominant cluster in the 2009-2010 season. The earliest time for us
to make this prediction is March 30, 2009. In Fig. 5(d) and (b), we already see this new cluster on the
left side of the figure, though since there are only about 10 sequences in the new cluster, the kernel density of
this new cluster is smaller than that of the dominant cluster. This strain was mentioned as a concern on
5 May 2009, although by conventional methods the strain was not considered a potentially new dominant
strain in July 2009 [18]. With the method of the present paper, this new cluster is suggested earlier, using
the data as of March 30, 2009.

Comparison with previous results. 

Here we present a historical test of the method. For each flu season in the Northern hemisphere from 1996,
we use only the H3N2 sequence data until February 1, before WHO published the recommendation
for the vaccine. We use the low-dimensional clustering to make the prediction for the dominant strain.
The conventional method as used by WHO is phylogenetic analysis combined with ferret antisera HI
assay. In Table 1, we compare the method with the conventional method. In the most recent 14 flu
seasons, influenza subtype H3 was dominant in 10. The WHO H3N2 vaccine component matches the
circulating strains in 8 seasons. Our predictions match the circulating strains in 9 seasons. In the 1997-1998
season, a novel flu strain Sydney/5/97 was found in June 1997. Because no similar strains were collected
before February 1, neither of the two methods can predict it. In the 2003-2004 season, our method predicts
Fujian/411/2002 as the dominant strain, while phylogenetic analysis combined with ferret antisera HI
assay did not. For all other 8 seasons dominated by influenza subtype H3, the predictions of both
methods matched the dominant circulating strain. The 2009-2010 influenza season was dominated by
H1N1. But data from local outbreaks of H3N2 infections [18,19] showed that the dominant H3N2
strain was A/British Columbia/RV1222/2009, as predicted in Table 1, rather than the vaccine strain
A/Brisbane/10/2007. For the 2010-2011 season, we recommend A/British Columbia/RV1222/2009 as a
vaccine strain, and the WHO recommended A/Perth/16/2009. These two strains are in the same cluster
and antigenically similar, with a small p_epitope = 0.048. Although these two strains are slightly different,
the vaccine is expected to be effective.

Detecting A/Wellington/1/2004 in the 2004 flu season in the Southern hemisphere

The low-dimensional clustering can also be applied to influenza in the Southern hemisphere. As an
example, we test our method on the 2004 flu season. The H3N2 vaccine strain recommended by WHO for
the 2004 flu season in the Southern hemisphere was A/Fujian/411/2002. Data from the surveillance
network suggested that the circulating dominant flu strain in the 2004 season in the Southern hemisphere
was A/Fujian/411/2002, and a late surge of A/Wellington/1/2004 was also observed. For example,
in Argentina, a study showed that about 50% of infections were closely related to A/Fujian/411/2002
and another 50% were closely related to A/Wellington/1/2004 [32]. In New Zealand, the dominant
flu strain was A/Fujian/411/2002, which caused 78% of flu infections [33], and a late season surge of
A/Wellington/1/2004 was also reported [34]. Therefore, the vaccine recommended by WHO matches
the dominant strain and would be expected to have vaccine efficacy in the 2004 season in the Southern
hemisphere.

We here use the low-dimensional clustering method to detect the A/Wellington/1/2004 strain, which
is not the major dominant strain but caused significant infections in the 2004 flu season. We plot the
protein distance map and kernel density estimation for the H3N2 viruses in Fig. 6(d) and 6(b). We use the
data only as of February 1, 2004, 3 months prior to the 2004 Southern hemisphere flu season, which is
usually from May to September. We observed two clusters. The major cluster on the left side of Fig. 6(d)
is A/Fujian/411/2002-like, which was the vaccine strain in the 2004 season. There is a new cluster on the right
side of Fig. 6(d) which contains A/Wellington/1/2004. The p_epitope of A/Wellington/1/2004 with regard
to A/Fujian/411/2002 is 0.118. Therefore, we predict that A/Wellington/1/2004 will infect a large fraction
of the population, and the A/Fujian/411/2002 vaccine is expected to provide only partial protection
against the A/Wellington/1/2004 virus. However, since the appearance of A/Wellington/1/2004 was just
before the 2004 flu season, it did not have sufficient time to spread out and become the dominant
strain in the 2004 flu season. From our observation, it usually takes about 8 months or longer for a new
strain to become dominant after its appearance in a new cluster. Therefore, the predominant flu strain in the
2004 season is expected to be A/Fujian/411/2002 based on the data as of February 1, 2004. This result
agrees with the dominant flu strain in the 2004 flu season.

Detecting A/California/7/2004 as a future dominant strain

As a further example of applying the low-dimensional clustering method to influenza in the Southern
hemisphere, we test the method on the 2005 flu season. The recommended H3N2 vaccine strain in the 2005 flu
season in the Southern hemisphere was A/Wellington/1/2004. Data from HI assay tests and surveillance
suggest that the dominant H3N2 strain in the 2005 season was A/California/7/2004. In HI tests with
postinfection ferret sera, the majority of influenza A(H3N2) viruses from February 2005 to October 2005
were closely related to A/California/7/2004, as reported by WHO on October 7, 2005 [35]. Surveillance
data from Victoria, Australia, show that 45% of influenza A infections were A/California/7/2004-like
(H3), 11% were A/Wellington/1/2004 (H3) and 44% were A/New Caledonia/20/99-like (H1), as collected
in the 2005 flu season [36]. Surveillance data from New Zealand also show that the dominant H3N2
strain in the 2005 flu season was A/California/7/2004 [37].

We plot the protein distance map for the H3N2 viruses in the 2003-2004 flu season in Fig. 6(c). We only
use the data as of September 30, 2004, earlier than the October 2004 date when the WHO published
the influenza vaccine recommendation for the Southern hemisphere. We plot the kernel density estimation
in Fig. 6(a) for the data in Fig. 6(c). There are three major clusters in Fig. 6(a). The one on
the left is the current dominant cluster, which consists mostly of A/Fujian/411/2002-like viruses. There is a
middle cluster centered on A/Wellington/1/2004. The one on the right contains A/California/7/2004.
Both the A/California/7/2004 cluster and the A/Wellington/1/2004 cluster are antigenically distinct from
A/Fujian/411/2002.

When the protein distance map and kernel density estimation as of February 1, 2004, are plotted in Fig. 6(d)
and (b), we still see the A/Wellington/1/2004 cluster. With these earlier data, the A/California/7/2004 cluster is
not observed. Thus, the A/California/7/2004 cluster is a newly appearing cluster, and we consider it to
be the emerging strain. The new cluster that contains A/California/7/2004 is separate from the current
dominant cluster. We calculated the average p_epitope of the new cluster that contains A/California/7/2004
with regard to A/Fujian/411/2002 to be 0.112. This makes the new cluster fulfill both criteria for an
incipient dominant strain cluster. So we predict, based on the information as of September 30, 2004, that
A/California/7/2004 will be the next dominant strain after A/Fujian/411/2002 in the Southern hemisphere.
We further predict from these data that A/California/7/2004 will be the dominant strain in the following
flu season in the Northern hemisphere. These predictions agree with the observed dominant strain in the
2005 flu season.

Discussion 

The evolution of influenza virus is driven by cell receptor distributions, non-specific innate host defense 
mechanisms, cross immunity [10,11], and other contributions to viral fitness. In this paper, we focused on 
HA protein evolution under antibody selection pressure. The degree to which the immunity induced by 
one strain protects against another strain depends on their antigenic distance [16]. Because the human 
immune response to viral infection is not completely cross protective, natural selection favors amino-acid 
variants of the HA protein that allow the virus to evade immunity, infect more hosts, and proliferate. 
Mutant strains surround the dominant strain and group into a cluster rather than evolve in a defined 
direction [22,23]. After the virus has circulated in the population for one or more years, effective vaccines and
cross immunity of the population drive the evolution of influenza by mutation and reassortment. This
evolution increases the immune-escape component of the fitness of new strains, and eventually causes a
new epidemic. These new immune-escape strains will form a new cluster, and the old clusters will die
out, thus starting a new cycle. This process of creation of new clusters is what our method detects.

The low-dimensional clustering can be applied not only to genetic sequences but also to distances
calculated from inhibition assays of antibodies and antigens, as developed by Lapedes and Farber [21]
and Smith et al. [22]. The inhibition assay provides an approximation of antigenic distance and is
broadly used as a marker for vaccine efficacy. The inhibition assay suffers from low resolution of data,
which multidimensional scaling improves, and is less able to predict the vaccine efficacy than the p_epitope
method [16]. The genetic sequences used here are a direct description of the evolution of the pathogen
and of the antigenic distance of influenza. To aid vaccine selection, the low-dimensional clustering on genetic
sequences appears informative.

Challenges may arise in application of the method described here. If two or more new clusters appear
in one season, additional information is needed to decide which cluster should be chosen for the vaccine.
Fortunately, it has been shown that the evolution of influenza is typically in one direction [11,22]. It is
rare to have two or more new clusters in the protein distance map in one season. As experience with the
low-dimensional sequence clustering is gained, it may be that cluster structure will allow more precise
prediction of vaccine efficacy. Despite these issues, the method described here can assist the design of
vaccines, and it provides a new tool to analyze influenza viral dynamics. We did not see any false positive
results in Table 1.

The current WHO method works quite well in many years. The method discussed here appears to 
offer an additional tool which may provide additional utility. 

Materials and Methods 
Data sources 

Influenza hemagglutinin A(H3N2) sequences before October 1, 2008, and A(H1N1) sequences as of
December 5, 2009, were downloaded from NCBI Influenza Virus Resources [24]. All hemagglutinin sequences
used in our study are filtered by removing identical sequences. Thus, each group of identical sequences in
the dataset is represented by the oldest sequence in the group. This approach reduces the number
of sequences by keeping only the unique sequences in the dataset. The hemagglutinin proteins of 2009
A(H1N1) used in our work are listed in Supplementary Table 3. The numerical labels in Figs. 1 and 2
are the same as the labels in the first column of Supplementary Table 3. Influenza A(H3N2) sequences
after October 1, 2008, were downloaded from the GISAID database; see Supplementary Table 6. GISAID
has the latest H3N2 sequence data.
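
The filtering step described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the function name `collapse_identical`, the record layout (name, collection date, sequence), and the toy records are all assumptions made for the example.

```python
from datetime import date

def collapse_identical(records):
    """Keep the oldest record of every group of identical sequences.

    records: iterable of (name, collection_date, sequence) tuples
    (a hypothetical layout chosen for this sketch).
    """
    oldest = {}
    for name, collected, seq in records:
        kept = oldest.get(seq)
        # Replace the stored record only if this one was collected earlier.
        if kept is None or collected < kept[1]:
            oldest[seq] = (name, collected, seq)
    # Return the survivors in order of collection date.
    return sorted(oldest.values(), key=lambda r: r[1])

# Toy records (invented): the first two share the same sequence.
records = [
    ("strain_A", date(2009, 4, 9), "MKAIL"),
    ("strain_B", date(2009, 4, 1), "MKAIL"),
    ("strain_C", date(2009, 4, 15), "MKTIL"),
]
unique = collapse_identical(records)
```

Here `unique` holds one record per distinct sequence, and the earlier of the two identical isolates is the one that survives.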

Geographical spread pattern of 2009 A(H1N1)

It is believed that the 2009 A(H1N1) virus most likely originated in Mexico [3]. It first spread to
the neighboring USA and then to other countries. We display this geographical spread pattern
in Fig. 1. We take the founder-F1 relationship from Fig. 1, and assume the virus spreads from the location of
the founder to the location of the F1. We consider three regions: the USA, Mexico, and all other countries.
Then we count the cases of spreading from one region to another region. In Supplementary
Table 2, we show that we observed many more paths of spreading from the USA to other countries than
from other countries to the USA. The major path of spreading is from the USA to other countries. This result
indicates that our directional evolutionary map of Fig. 1 is in good agreement with the pattern of geographical
spread.
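
The counting of spreading paths can be sketched as follows. The function names and the example pairs are hypothetical, and the choice to skip pairs whose founder and F1 come from the same country is an assumption of this sketch rather than a rule stated in the text.

```python
from collections import Counter

def region(country):
    """Map a country to one of the three regions used in the text."""
    return country if country in ("USA", "Mexico") else "Others"

def count_spread_paths(founder_f1_pairs):
    """Count region-to-region paths over (founder country, F1 country) pairs."""
    counts = Counter()
    for founder_country, f1_country in founder_f1_pairs:
        if founder_country == f1_country:
            continue  # assumption: no spread is counted within one country
        counts[f"{region(founder_country)} to {region(f1_country)}"] += 1
    return counts

# Invented founder-F1 pairs, for illustration only.
pairs = [
    ("USA", "Canada"),
    ("USA", "Japan"),
    ("Mexico", "USA"),
    ("France", "Italy"),
    ("USA", "USA"),
]
paths = count_spread_paths(pairs)
```

With real founder-F1 pairs read off the evolutionary path, the resulting counter would have the same shape as the supplementary table of spreading paths.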

Multidimensional scaling 

The goal of multidimensional scaling is to represent the distance between proteins by a Euclidean distance in
coordinate space. We calculate the distance between proteins i and j, $d_{ij}$, by the number of amino acid
residue differences divided by the total number of amino acid residues, as defined by Equation 1 in the
main text. To do multidimensional scaling, we start with the distances between the proteins. The object of
multidimensional scaling is to find the two, or p in general, directions that best preserve the distances
$d_{ij}$ between the N proteins

$$F = \sum_{i,j}^{N} \left( d_{ij} - D_{ij} \right)^2 \qquad (2)$$

Here, $D_{ij} = \| x_i - x_j \|$ is the Euclidean distance between proteins i and j in the projected space, and
$\| \cdot \|$ is the vector norm. The algorithm is as follows. Let the matrix $A = [(a_{ij})]$, where
$a_{ij} = -\frac{1}{2} d_{ij}^2$. The eigenvalues of A are $\gamma_1, \gamma_2, \ldots, \gamma_N$ and
$\gamma_1 > \gamma_2 > \ldots > \gamma_N$. Let $V^{(1)} = (v_1^{(1)}, v_2^{(1)}, \ldots, v_N^{(1)})$ be the eigenvector
of $\gamma_1$ and $V^{(2)} = (v_1^{(2)}, v_2^{(2)}, \ldots, v_N^{(2)})$ be the eigenvector of $\gamma_2$. Let
$x = \sqrt{\gamma_1} V^{(1)}$ and $y = \sqrt{\gamma_2} V^{(2)}$. The two
coordinates in Figs. 2-6 are x and y. The x-axis in the protein distance map corresponds to the eigenvector with the largest eigenvalue.
We take the H3N2 2008-2009 season as an example. In Fig. 5(c), we observe two clusters. One cluster is
on the right side of the figure, with positive x values, and the other has negative x values. We define the
consensus sequence of a group of flu strains by taking the most frequent amino acid at each position. We
calculate the consensus sequences for the strains in the cluster on the right and in the cluster on the left of the figure.
We found that the amino acids at four positions (76, 160, 172, 203) differ between these two consensus H3N2
strains; see Supplementary Table 1. Interestingly, the Shannon entropies calculated from all 2008-2009
season sequences at these four positions (0.43, 0.67, 0.59, 0.50) are the largest, which means that the diversity
at these four positions is the largest.
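
The consensus and entropy calculations can be sketched as below. The toy sequences are invented, the function names are hypothetical, and the use of the natural logarithm for the Shannon entropy is an assumption, since the text does not state the base.

```python
import math
from collections import Counter

def consensus(seqs):
    """Most frequent residue at each position of equal-length aligned sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))

def shannon_entropy(column):
    """Shannon entropy of one alignment column (natural logarithm assumed)."""
    n = len(column)
    return -sum((c / n) * math.log(c / n) for c in Counter(column).values())

def differing_positions(a, b):
    """1-based positions at which two sequences differ."""
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]

# Toy clusters (invented); real input would be aligned HA1 sequences.
cluster_right = ["KENK", "KENK", "KEKK"]
cluster_left = ["KKNK", "KKNK", "KKNR"]

cons_right = consensus(cluster_right)
cons_left = consensus(cluster_left)
diff = differing_positions(cons_right, cons_left)

# Entropy at every position, computed over all sequences of the season.
all_seqs = cluster_right + cluster_left
entropies = [shannon_entropy(col) for col in zip(*all_seqs)]
```

In this toy case the two consensus sequences differ only at position 2, which is also the position with the largest entropy, mirroring the observation made for positions 76, 160, 172, and 203.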

Software is available to run the multidimensional scaling. We use the Matlab function
"CMDSCALE" to generate an N x p configuration matrix Y. Rows of Y are the coordinates of N points in
p-dimensional space. "CMDSCALE" also returns a vector E containing the sorted eigenvalues of
what is often referred to as the "scalar product matrix," which, in the simplest case, is equal to $YY^{T}$. If
only two or three of the largest eigenvalues E are much larger than the others, then the matrix D based on
the corresponding columns of Y nearly reproduces the original distance matrix d. We use the influenza
H3N2 2001-2002 season as an example. The five largest of all 180 eigenvalues are 0.0361, 0.0032,
0.0024, 0.0020, 0.0016. The two largest eigenvalues contribute 70% of the sum of all 180 eigenvalues,
which indicates p = 2. Then, we plot the N points in a two-dimensional graph. Each point represents
a protein. The Euclidean distance $D_{ij}$ between any two points on the graph should be equal to or close
to the distance between these two proteins, that is, $D_{ij} \approx d_{ij}$. As an example, in Supplemental Fig. 1, we
show that $D_{ij}$ and $d_{ij}$ have a strong linear relationship. A short MATLAB program for multidimensional
scaling is as follows.
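% Multidimensional scaling.
% alignment.aln is a multiple sequence alignment file
% generated by the software ClustalW.
clear
% Read the aligned sequences.
Sequences = multialignread('alignment.aln');
% Pairwise p-distance: fraction of differing residues.
distances = seqpdist(Sequences, 'Method', 'p-distance');
% Classical multidimensional scaling; rows of Y are point coordinates.
Y = cmdscale(distances);
% Plot the first two coordinates as the protein distance map.
scatter(Y(:,1), Y(:,2));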
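
For readers without MATLAB, the same procedure can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the code used in the study: the scalar-product matrix is double-centered, as in standard classical multidimensional scaling, the two leading eigenpairs are found by power iteration with deflation, and the function names and toy sequences are hypothetical. It assumes the sequences are not all identical and that the two leading eigenvalues are positive.

```python
import math

def p_distance(a, b):
    """Fraction of aligned positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def double_center(d):
    """Build a_ij = -d_ij^2 / 2 and subtract row, column, and grand means."""
    n = len(d)
    a = [[-0.5 * d[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in a]  # equals column means (a is symmetric)
    grand_mean = sum(row_mean) / n
    return [[a[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
            for i in range(n)]

def top_eigenpair(b, iterations=500):
    """Power iteration: eigenpair of largest magnitude of a symmetric matrix."""
    n = len(b)
    v = [1.0 + 0.1 * i for i in range(n)]  # fixed, non-constant starting vector
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(x * y for x, y in zip(v, bv)), v  # Rayleigh quotient, eigenvector

def mds_2d(seqs):
    """Return the two leading eigenvalues and the x, y coordinates of each sequence."""
    n = len(seqs)
    d = [[p_distance(seqs[i], seqs[j]) for j in range(n)] for i in range(n)]
    b = double_center(d)
    g1, v1 = top_eigenpair(b)
    # Deflation: remove the first eigenpair before searching for the second.
    deflated = [[b[i][j] - g1 * v1[i] * v1[j] for j in range(n)] for i in range(n)]
    g2, v2 = top_eigenpair(deflated)
    x = [math.sqrt(g1) * c for c in v1]
    y = [math.sqrt(g2) * c for c in v2]
    return (g1, g2), x, y

# Toy alignment (invented): two pairs of closely related sequences.
seqs = ["AAAA", "AAAT", "TTTA", "TTTT"]
(g1, g2), x, y = mds_2d(seqs)
```

For this toy alignment the two leading eigenvalues are 0.75 and 0.25, and the sign of the x coordinate separates the first two sequences from the last two, which is the kind of separation the protein distance map relies on. Power iteration is used only to keep the sketch free of external libraries; a numerical eigensolver would be the usual choice for real data.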

p_epitope estimation

The value of p_epitope is a measure of antigenic distance between the influenza A vaccine strain and circulating
strains. The hemagglutinin protein has five epitopes. The dominant epitope for a particular circulating
strain in a particular season was taken as that which had the largest fractional change in amino acid
sequence relative to the vaccine strain. The value of p_epitope is defined as the ratio of the number of
amino acid differences in the dominant epitope to the total number of amino acids in the dominant epitope.
The antigenic distance between the vaccine strain and the circulating strain is quantified by p_epitope.
By a metaanalysis of historical vaccine efficacy data from over 50 publications, Gupta et al. showed
that the p_epitope between the vaccine strain and the circulating strain correlates well with the
vaccine efficacy, with R^2 > 0.8 [16]. The value of p_epitope can be easily calculated from sequence data.
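
The definition above translates directly into code. In the sketch below, the function name, the two short sequences, and the epitope site lists are invented for illustration; they are not the real epitope definitions of H3N2 hemagglutinin, which would have to be supplied from the literature.

```python
def p_epitope(vaccine, circulating, epitopes):
    """Return (dominant epitope, p_epitope) for two aligned sequences.

    epitopes: dict mapping an epitope name to a list of 1-based positions
    (hypothetical site lists in this sketch).
    """
    fractions = {
        name: sum(vaccine[p - 1] != circulating[p - 1] for p in sites) / len(sites)
        for name, sites in epitopes.items()
    }
    # The dominant epitope is the one with the largest fractional change.
    dominant = max(fractions, key=fractions.get)
    return dominant, fractions[dominant]

# Toy data (invented): two epitopes on a 10-residue sequence.
toy_epitopes = {"A": [1, 2, 3, 4], "B": [5, 6, 7, 8, 9]}
vaccine_seq = "KTNGSQYEDL"
circulating_seq = "KTDGSQHEEL"

dominant, value = p_epitope(vaccine_seq, circulating_seq, toy_epitopes)
```

In this toy case epitope A has one change in four sites and epitope B has two changes in five sites, so B is the dominant epitope and p_epitope is 0.4.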

Biases in the data 

There are two biases in the sequence data. First, more isolates have been sequenced in recent years. Generally
speaking, more sequences make vaccine selection based on low-dimensional clustering methods more
reliable. That is why we compared low-dimensional clustering methods with WHO results only since
1996 in Table 1. To avoid this bias in generating the figure of the 40-year evolutionary history of influenza
(Fig. 3), we chose 20 random isolates for each season, even though the database contains
more sequences in recent years. Second, most isolates are collected in the USA. We found that many isolates
collected in the USA are identical, because of the high sampling rate in the USA. To reduce this bias, we collapse
redundant strains, keeping only distinct strains.
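
The per-season subsampling can be sketched as follows. The function name, the record layout, and the fixed random seed are assumptions of this sketch; the text specifies only that 20 random isolates are chosen for each season.

```python
import random
from collections import defaultdict

def subsample_per_season(isolates, per_season=20, seed=0):
    """Pick at most `per_season` random isolates from every season.

    isolates: iterable of (season, name) pairs (hypothetical layout).
    """
    by_season = defaultdict(list)
    for season, name in isolates:
        by_season[season].append(name)
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    return {
        season: rng.sample(names, per_season) if len(names) > per_season else list(names)
        for season, names in sorted(by_season.items())
    }

# Toy data (invented): a sparsely and a densely sampled season.
isolates = [("1968", f"old_{i}") for i in range(5)]
isolates += [("2007", f"new_{i}") for i in range(100)]
picked = subsample_per_season(isolates)
```

Seasons with fewer isolates than the cap are kept whole, so early, sparsely sampled seasons are not discarded.
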

Figure 1. The evolutionary path of 2009 A(H1N1) influenza. Strain #1: A/California/05/2009. Strain
#2: A/California/04/2009. Strain #7: A/California/07/2009. Strain #12: A/Texas/05/2009. Strain
#28: A/New York/19/2009. For complete strain names, see supplemental material. Strains from the
Northern and Southern hemisphere are shown as red dots and blue dots respectively. One branch
represents one substitution in the amino acid sequence.

Figure 2. (a) Kernel density estimation for the protein distance map of 2009 A(H1N1) influenza as of
December 5, 2009. (b) The protein distance map of 2009 A(H1N1) influenza. The vertical and
horizontal axes of both figures represent protein distance as defined in Equation 1. A 0.0018 unit of
protein distance equals one substitution in the HA protein sequence of H1N1. The height and colors in
(a) both represent the density of isolates.

Figure 3. (a) The protein distance map and (b) corresponding kernel density estimation of influenza
from 1968 to 2007. The vertical and horizontal axes of both figures represent protein distance as defined
in Equation 1. A 0.0030 unit of protein distance equals one substitution in the HA1 protein sequence of
H3N2. The colors in (a) represent the time of collection of the isolates. The colors and height in (b)
represent the density of isolates. Each cluster is named after the first vaccine strain in the cluster.
HK68: Hong Kong/1/68, EN72: England/42/72, VT75: Victoria/3/75, TX77: Texas/1/77, BK79:
Bangkok/1/79, PP82: Philippines/2/82, SC87: Sichuan/2/87, BJ89: Beijing/32/92, SD93:
Shandong/9/93, JB94: Johannesburg/33/94, WH95: Wuhan/359/95, SN97: Sydney/5/97, PM99:
Panama/2007/99, FJ02: Fujian/411/2002.

Figure 4. (a) Kernel density estimation and (c) protein distance map for H3N2 viruses between
October 1, 2002, and February 1, 2003. (b) Kernel density estimation and (d) protein distance map for
H3N2 viruses between October 1, 2001, and September 9, 2002. We plot a dotted line to separate the
two clusters. The vertical and horizontal axes of all figures represent protein distance as defined in
Equation 1. A 0.0030 unit of protein distance equals one substitution of the HA1 protein sequence of
H3N2.

Figure 5. (a) Kernel density estimation and (c) protein distance map for H3N2 viruses from October
1, 2008, to June 14, 2009. (b) Kernel density estimation and (d) protein distance map for H3N2 viruses
between October 1, 2008, and March 30, 2009. The vertical and horizontal axes of all figures represent
protein distance as defined in Equation 1. A 0.0030 unit of protein distance equals one substitution of
the HA1 protein sequence of H3N2.

Figure 6. (a) Kernel density estimation for the protein distance map for H3N2 viruses between
10/01/2003 and 09/30/2004. (b) Kernel density estimation for the protein distance map for H3N2
viruses between 10/01/2003 and 02/01/2004. (c) Protein distance map for H3N2 viruses between
10/01/2003 and 09/30/2004. We plot a dotted line to separate the two clusters. (d) Protein distance
map for H3N2 viruses between 10/01/2003 and 02/01/2004. The vertical and horizontal axes of all
figures represent protein distance. A 0.0030 unit of protein distance equals one mutation of the HA1
protein sequence of H3N2.

Table 1. Summary of results

Flu season   Vaccine strain      Our prediction              Circulating                 Circulating
             from WHO [30]                                   H3N2 strain                 subtype
1996-1997    Wuhan/359/95        Wuhan/359/95                Wuhan/359/95                H3
1997-1998    Wuhan/359/95        Wuhan/359/95                Sydney/5/97                 H3
1998-1999    Sydney/5/97         Sydney/5/97                 Sydney/5/97                 H3
1999-2000    Sydney/5/97         Sydney/5/97                 Sydney/5/97                 H3
2000-2001    Panama/2007/1999    Panama/2007/1999            N/A                         H1
2001-2002    Panama/2007/1999    Panama/2007/1999            Panama/2007/1999            H3
2002-2003    Panama/2007/1999    Fujian/411/2002             N/A                         H1
2003-2004    Panama/2007/1999    Fujian/411/2002             Fujian/411/2002             H3
2004-2005    Fujian/411/2002     Fujian/411/2002             Fujian/411/2002             H3
2005-2006    California/7/2004   California/7/2004           California/7/2004           H3
2006-2007    Wisconsin/67/2005   Wisconsin/67/2005           Wisconsin/67/2005           H3
2007-2008    Wisconsin/67/2005   Wisconsin/67/2005           N/A                         H1
2008-2009    Brisbane/10/2007    Brisbane/10/2007            Brisbane/10/2007            H3
2009-2010    Brisbane/10/2007    BritishColumbia/RV1222/09   BritishColumbia/RV1222/09   H1
2010-2011    Perth/16/2009       BritishColumbia/RV1222/09   N/A                         N/A

This table includes the H3N2 vaccine strains, our prediction of dominant strains, the reported
dominant circulating H3N2 strains [38-53], and the circulating subtypes in the northern hemisphere [38-53].
Circulating H3N2 strains are absent if the dominant subtype is H1 or influenza B. The reported
dominant H3N2 strains and circulating subtypes data are from WHO Weekly Epidemiological Record
(http://www.who.int/wer/en/).



Low-dimensional clustering detects incipient dominant influenza 
strain clusters 

Jiankui He¹, Michael W. Deem¹,²,*

1 Department of Physics & Astronomy, Rice University, Houston, Texas, USA

2 Department of Bioengineering, Rice University, Houston, Texas, USA
* Corresponding author. E-mail: mwdeem@rice.edu



Supplementary Data 







Figure 1. Euclidean distances between proteins in the projection of Figure 4(d) of the main text (x-axis)
plotted against the protein distances of the corresponding proteins (y-axis). Closeness to the diagonal
measures the fidelity of the low-dimensional projection. A 0.0030 unit of protein distance equals one mutation
of the HA1 protein sequence of H3N2.

Spreading path      Number of cases
USA to Others       32
Others to USA       1
Mexico to Others    1
Others to Mexico
Others to others    6

Table 1. The geographical spread pattern of 2009 A(H1N1). "Others" refers to countries other than the
USA and Mexico.

Position in HA1 protein of H3N2     76     160    172    203
Amino acid in consensus strain 1    Glu    Asn    Lys    Asn
Amino acid in consensus strain 2    Lys    Lys    Asn    Lys
Shannon entropy                     0.43   0.67   0.59   0.50

Table 2. Consensus strain 1 is calculated from all strains in the cluster on the right side of Figure
5(c). Consensus strain 2 is calculated from all strains in the cluster on the left side of Figure 5(c).



Table 3. The 2009 A(H1N1) (swine flu) sequences in Figure 1. Data are from NCBI Influenza Virus
Resources.

Label  Accession number  Country    Collection date   Virus name
                                    (mm/dd/yyyy)
1      AOr4192o          USA        3/30/2009         A/California/05/2009
2      ACP41105          USA        4/1/2009          A/California/04/2009
3      ACR093d4          Mexico     4/2/2009          A/Mexico/4108/2009
4      AOKUyoYZ          Mexico     4/2/2009          A/Mexico/0955/2009
5      ACy99ol3          Mexico     4/2/2009          A/Mexico/4108/2009
6      ACP44189          USA        4/9/2009          A/California/07/2009
7      AOP41953          USA        4/9/2009          A/California/07/2009
8      AC 136657         USA        4/14/2009         A/Texas/04/2009
9      ACR49285          USA        4/14/2009         A/Texas/04/2009
10     ACU29959          USA        4/15/2009         A/San Antonio/PR922/2009
11     ACy83310          USA        4/15/2009         A/Texas/15/2009
12     ACP41934          USA        4/15/2009         A/Texas/05/2009
13     ACQ99d10          Mexico     4/19/2009         A/Mexico/4603/2009
14     ACR09375          Mexico     4/20/2009         A/Mexico/4635/2009
15     ACRo7180          USA        4/21/2009         A/California/13/2009
16     ACl 36662         USA        4/22/2009         A/California/12/2009
17     AC(c27o383        USA        4/22/2009         A/Indiana/09/2009
18     ACR67176          USA        4/23/2009         A/Texas/10/2009
19     ACU29969          USA        4/23/2009         A/San Antonio/PR923/2009
20     ACR18983          USA        4/24/2009         A/Kansas/03/2009
21     ACR18990          USA        4/24/2009         A/New York/31/2009
22     ACR18980          USA        4/24/2009         A/Texas/08/2009
23     ACy63286          USA        4/24/2009         A/Ohio/07/2009
24     ACQ73385          Canada     4/24/2009         A/Canada-NS/RV1535/2009
25     ACy76340          USA        4/24/2009         A/Kansas/03/2009
26     ACQ76386          USA        4/24/2009         A/Ohio/07/2009
27     ACR49292          USA        4/24/2009         A/California/11/2009
28     ACP44147          USA        4/25/2009         A/New York/19/2009
29     AC 136665         USA        4/25/2009         A/New York/45/2009
30     ACQ63209          USA        4/25/2009         A/New York/12/2009
31     ACR18986          USA        4/25/2009         A/New York/11/2009
oZ 




TTQ A 


A /OK /onno 
4/ zo/ zuuy 


A/iNew lorK/uy/zuuy 


33     ACR18994          USA        4/25/2009         A/New York/12/2009
34     ACR67173          USA        4/25/2009         A/New York/35/2009
35     ACR67169          USA        4/26/2009         A/South Carolina/10/2009
36     ACQ76333          USA        4/26/2009         A/South Carolina/09/2009
37     ACQ76362          USA        4/26/2009         A/Arizona/02/2009
38     ACR49284          USA        4/26/2009         A/South Carolina/09/2009
39     ACR08534          USA        4/27/2009         A/Texas/23/2009
40     ACR08429          USA        4/27/2009         A/New York/3012/2009
41     ACR08526          USA        4/27/2009         A/Florida/04/2009
42     ACR49289          USA        4/27/2009         A/Arizona/04/2009
43     ACR38825          USA        4/27/2009         A/Georgia/01/2009
AA


/\v^rL±oyy i 


TTQ A 
U DA 


A /97 /9nnQ 
4/ z / / zuuy 


A /^A/'Qclninrrf r-kn /l 9 /900Q 

A/ vvasningijOn/ iz/ zuuy 


A ^ 
40 


i OoiUO 


Mexico 


A /97 /9nnQ 
Ai 1 zuuy 


A/ iViexico/iniJiiii;io04 / /zuuy 


40 


/\Ly^o440 / 


TTQ A 

U oA 


A /97 /9nnQ 
A{ j zuuy 


A /AToATtr V/^T-V / I <^S9 /900Q 

A/iNew 1 orK/ iooz/ zuuy 


A7 
4 / 


A PP<^71 Q/1 
ALyilO / iy4 


TTQ A 
UoA 


A /9Q /9nnQ 

4/ zo/ zuuy 


A / ijeiaware / uo / zuuy 


A91 
4o 


/i-LyrLiOOZi 


United Kingdom 


A /9S 79000 

4/ Zo/ zuuy 


A /TpTifrlciTirl /I QP^ /900Q 

A / H/ngiana / i y 0/ zuuy 


AQ 

4y 


AUrt4yzy4 


TTQ A 

UoA 


A /9Q /9nnQ 

4/ Zo/ zuuy 


A / ijeiaware / uo / zuuy 


ou 


/\VyDy4004 


TTQ A 

U DA 


A /9R /900Q 
4/ Zo/ zuuy 


A /T^fil QAsr^rfi /OR /900Q 

A / ueiawai e / uo / zuuy 


Oi 


/\Lyoy404o 


TTQ A 
UDA 


A /9Q /900Q 

4/ Zo/ zuuy 


A /T^ol oTTT-oviQ /07 /900Q 

A / ijeiaware / u / / zuuy 


oz 


/lLyrl0040U 


TTQ A 
U DA 


A /98 /900Q 

4/ Zo/ zuuy 


A /IVoAur V/^T-V QQ /900Q 

a/incw lorK/ oiyy/ zuuy 


Oo 


A PTQQriQ/l 
ilLy i ooUy4 


Germany 


A /9Q /9nnQ 

4/ zy/ zuuy 


A /T^OTriQT«n /<^9 /900Q 

A / ijayern / oz / zuuy 


^A 
04 


J\\^ i OoiUo 


Mexico 


A /9Q /900Q 

4/ zy/ zuuy 


A/ iviexico/ iniJrLii/io4y4/ zuuy 


00 


AL7ix4oyO / 


France 


/I /9Q /9nnQ 
4/ zy/ zuuy 


A /PoT-io /9i^7Q /900Q 

A / i^aris / zo / 0/ zuuy 


00 


A PP^^71 7Q 
ilvyilO / i 1 y 


TTQ A 
U DA 


/I /"^O /900Q 

4/ ou/ zuuy 


A /T^onf nr>l-A^ /Ol^ /900Q 

A / iventucKy / uo / zuuy 


/ 




France 


/I 7*^0 /9nOQ 

4/ ou/ zuuy 


A /PoT-io /9P^QO /900Q 

A / r aris / zoyu / zuuy 


Oo 




TTQ A 
U DA 


/I /"^O /900Q 

4/ ou/ zuuy 


A /McnTtr V/-»T-V /'^9'^9 /900Q 

A/iNew 1 orK/ ozoz/ zuuy 


oy 


A PP 1 QQ9n 
ALvixioyZU 


Hong Kong 


4/ou/ zuuy 


A /TTr^nrr \^ r\Tn re / C\^ /900Q 

A/riong ivong/ ui/ zuuy 


ou 


J\\j i o0004 


TTQ A 
U DA 


A /'^O /900Q 

4/ ou/ zuuy 


A /PQlif/^T-niQ /^ Q /900Q 

A/ v^aniornia/ ry/zuuy 


Oi 


iWu i oOOOi 


TTQ A 
UDA 


A /*^0 /900Q 

4/ou/ zuuy 


A /T/2k-rm/2kOO/24/2k /07 /900Q 

A/ lennessee/Ui /zuuy 


oz 


/vLyrloiOoo 


France 


IX /I /900Q 

o / L 1 zuuy 


A /Pqt-ic /9i^Q1 /900Q 

A/ r aris/ zoy i / zuuy 


Oo 


A PQ979QQ 
AL^oZ / Zoy 


TTQ A 
U DA 


0/ i/ zuuy 


A /ATcnTcr /QQ07 /900Q 

A/iNCw I orK/ oou / / zuuy 


04 


A PTTi '^1 nn 


TTQ A 
U DA 


1^/1 /9nnQ 
0/1/ zuuy 


A /PQlifr^r-niQ /97 /9nnQ 

A/ Lydiiioi nici/ z / /zuuy 


00 


ALv i oOOOU 


TTQ A 
UDA 


0/ i/zuuy 


A /Pol^f/^T-ri^o /90 /900Q 

A/ Lyaniornia/ zu / zuuy 


00 


A PP<^71 7» 
Av^ilO 1 i / o 


TTQ A 
U DA 


IX //I /9nOQ 

0/4/ zuuy 


A /A/ovTYionf ICM. /900Q 

A/ Vermont/ Uo/ zuuy 


/ 


A PTTI *^nQ^^ 

/Ai^u iouyo 


TTQ A 
UDA 


c: /900Q 

0/ 0/ zuuy 


A /TovQQ /*^0 /900Q 

A/ lexas/ou/ zuuy 


Oo 


A PQ79<^i^n 
ALyO / ZOOU 


TTQ A 
U DA 


ix /IX /9nOQ 

0/0/ zuuy 


A /A/Ti/^lni rron /Ol^ /900Q 

A / iviicnigan / uo / zuuy 


oy 


A PQ7Qn9^^ 
Av^o / oUZO 


TTQ A 
UDA 


0/0/ zuuy 


A /IVoiTr Vtm-V/QQI^I /900Q 

A/iNCw 1 orK/ ooOi/ zuuy 


/ u 


A PQQ1 Ann 
/vLyoy i4uu 


France 


/900Q 

0/0/ zuuy 


A /Qf T-QcK/^iiT-rf /9^^1 1 /900Q 

A/ DtrasDourg/ zoii / zuuy 


71 


A PT<^Qi no 
PiXj i Ooiuy 


Canada 


1^ /7 /9nnQ 
0/ / /zuuy 


A /Pono/^o PP /P"\/"1 7i^Q /900Q 

A/ L^anaQa-ic^Ly / xxv i / oy / zuuy 


79 


A PTf^SI 1 1 

i OOiii 


Canada 


/o /9nOQ 

0/ o/ zuuy 


A /PQTiQrlQ QT^ /PA^I 70"^ /900Q 

A/ LvanaQa-Div/ xxv i / yo/zuuy 


( 6 


A PP '^^^*^'^9 
/Ai^ilOOoOZ 


Sweden 


o/o/ zuuy 


A /Qf r>w/^W>/^lYvi /9Q /900Q 

A / DtocKnoim / zy / zuuy 


7/1 
i 4 


A PP "^900^^ 

/ALyixozyyo 


Finland 


IX /I n /900Q 

/ iu/ zuuy 


A/r mianQ/ 000/ zuuy 


7^ 
( 


A PP QQQ77 
ALvilooo / / 


China 


0/ iu/ zuuy 


A /Qlnon/^/^nrr /I /900Q 

A / Dnanaong / i / zuuy 


7<^ 
/ 


A PQ77QQ^^ 

iiLyo / / yyo 


TTQ A 
U DA 


IX /I /I /9nOQ 

/ i4/ zuuy 


A /IVoiTtr V/^T-V /"^/l^^"^ /900Q 

A/ IN ew 1 orK / o40o / zuuy 


77 


i o0U4y 


TTQ A 
UDA 


^ lAf{ /900Q 

0/ io/ zuuy 


A /IVoTTtr V/^T-V /'^'^09 /900Q 

A/iNew lorK/oouz/ zuuy 


7R 
/ o 


A PP/1<^QQ1 

/\.L^rt4oyy i 


Japan 


IX /I <^ /900Q 

/ io/ zuuy 


A /PcqVq /I /900Q 

A/ v^saKa/ i / zuuy 


7Q 


A PQQ1 /in9 
ALvoy i4UZ 


France 


o / L ( 1 zuuy 


A /PoT-io /9f^i^O /900Q 

A/ r aris/ ZOOU/ zuuy 


oU 


A PCl'^AA/1 
ALyOODD40 


China 


IX /I o /9nnQ 
/ lo/ zuuy 


A / ^UcingznouDij / ui/ zuuy 


oi 


A P<il79<^4<^ 
AL^o / Z040 


TTQ A 
U DA 


IX /I Q /900Q 

/ iy/ zuuy 


A / A T-iry/^na /OQ /900Q 

A / Ai izona / uy / zuuy 


82     ACT68114          Canada     5/20/2009         A/Canada-MB/RV1964/2009
83     ACT82512          Brazil     5/20/2009         A/Sao Paulo/2056/2009
84     ACS34967          Japan      5/21/2009         A/Sakai/2/2009
85     ACT68117          Canada     5/21/2009         A/Canada-MB/RV1982/2009
86     ACR67262          Italy      5/21/2009         A/Milan/UHSR1/2009
87     ACU29999          Taiwan     5/22/2009         A/Taiwan/T1339/2009
88     ACS34968          Japan      5/22/2009         A/Shiga/1/2009
89     ACS27770          Russia     5/22/2009         A/Kaluga/01/2009
90     ACS91399          France     5/22/2009         A/Paris/2670/2009
iW^rL04y04


PlniriQ 


IX /9Q /9nnQ 
/ zo/ zuuy 


i\/ Zjnejiang/ 1/ zuuy 


Q9 


/ALyXlO / Z04 


PV.^no 

LyiLina 


0/ Zo/ zuuy 


A /T^/2kiii-nrr //I /9nnQ 

iA/ oeijing/ 4/ zuuy 


yo 


A PT/^SI ni 
1 OolUl 


Canada 


/ Z4/ zuuy 


iA / L^anaQa-iviij / rtv zuzo / zuuy 


Q/l 

y4 


A P"R 7Qi^QQ 


Russia 


0/ zo/ zuuy 


A /A/\r\c^r^r\^TT /TTA/"09 /900Q 

i\. 1 ivioscow / 11 V uz / zuuy 


yo 




r iniana 


0/ zo/ zuuy 


A /TTiTilcmrl /'^'^A/900Q 

i\/ r iniana / oo4 / zuuy 


yo 


A PTQQ1 9Q 
ALv 1 ooi-Zo 


Brazil 


1^ /9Q /9nnQ 
o/zo/zuuy 


A /A/Tof Pr-r^oo/^ /9q9Q /900Q 

A/iviato Lrrosso/zozy/zuuy 


Q7 

y / 


iivy u ouuuy 


Taiwan 


IX /oo /9nnQ 
/ Zo/ zuuy 


A /TQiinrQin /Tl 77^^ /900Q 

iA/ laiwan/ 11/ / 0/ zuuy 


yo 


A PT^^ftl 1 ft 

J\\y 1 Ool lO 


Canada 


1^ /9Q /9nnQ 
o/zo/zuuy 


A /Panorlo A/TT^ /PA/^901 ^ /900Q 

iA/ vyanaaa-iviij / riv zuio / zuuy 


QQ 

yy 


1 / yozo 




0/ Zo/ zuuy 


A /QliQTirfliQi /<^0'T /900Q 

/A/ onangnai/ ou 1 / zuuy 


1 nn 

iUU 


A PTQ9C^1 /I 
/VLy 1 oZ014 


Brazil 


/9Q /9nnQ 
D/zy/zuuy 


A /Qor-K Ponl/^ /99<^1 /900Q 

A/oao r aulo/ zzoi/ zuuy 


iUi 


-r\- IlO O O O 




^ /9Q /9nnQ 
/ zy/ zuuy 


A /P n Qn rrrl/^n (T /O^ /900Q 

-f\/ vruanguong/ uo/ zuuy 


1 no 


A PT91 QQ9 

ALy 1 ziyoz 


Sweden 


1^ /9Q /9nnQ 
o/zy/zuuy 


A /Qf r^/^Vln/^lvYi /Q1 /900Q 

A/ oiocKnoim/ oi / zuuy 


lUO 


-r\.VyOOOOZZ 




p: /qi /9nnQ 
0/01/ zuuy 


A /yVif^iiano- /9 /900Q 

r^j Zjiiejiang/ z/ zuuy 


iU4 


A PT91 Q71 
1 ziy / 1 


Sweden 


/I /9nnQ 
0/1/ zuuy 


A /Qf r>w/^W>/^lm /Qt^ /900Q 

A / aiocKnoim / oo / zuuy 


iUO 


A PQQ/1 i^i^l 
/lLvoy4001 


TTQ A 


/I /9nnQ 
0/1/ zuuy 


A /Pln/^rlci Tclanrl /0/1 /900Q 

/A/xinoQe isianQ/ U4/ zuuy 


iUD 


ALyD04Z0Z 


Japan 


<^ /9 /9nnQ 
o/z/zuuy 


A /T'/^l.^noViim /I /900Q 

iA/ lOKusnima/ 1 / zuuy 


iU / 


A PQQ1 "^QS 
iAL^oy loyo 


France 


/9 /9nnQ 
o/z/ zuuy 


A /Pqt-ic /9799 /900Q 

iA/ r aris/ z / zz/ zuuy 


iUo 


T3 A TTQC;Q9Q 

o/AxiyooZo 


Japan 


/q /9nnQ 

0/o/zuuy 


A /Qo^f omo /900Q 

iA / oaiiama / 00 / zuuy 


1 HQ 

iuy 


A PQ/1 1^0"^ 1^ 
/lLya40UoO 


TTQ A 


a /IX /9nnQ 

0/0/ zuuy 


A /PQlifr^T-niQ /O/l /900Q 

iA/ Lyamornia/ U4/ zuuy 


iiU 


A PT7Q/^9/l 
AL^ 1 / yoz4 


China 


0/0/ zuuy 


A /QViovirrVio; /I /IQT /900Q 

A / onangnai / i4o 1 / zuuy 


ill 


A PT8'^7'^7 


iiaiy 


0/ o/ zuuy 


A /A-nr>r^-no /OI 79000 

A / AnconcL/ ui/ zuuy 


119 
iiz 


A PT1 riQI 
ALy 1 lUolO 


Hong Kong 


/1 1 /9nnQ 
0/11/ zuuy 


A /TTr>wnrf T<^/^n rr /9Q^^Q /900Q 

A/nong ivong/ Zooy/ zuuy 


1 1 


APTR'^7'^ft 
/\Vv 1 oo / oo 


iiaiy 


R /1 1 /9nnQ 
o/ 1 1/ zuuy 


A / A nnon Q /09 /900Q 
i\. / i\ncona / uz / zuuy 


11/1 


/Ai^oyzouu 


Japan 


/I ^ /onno 
0/ 10/ zuuy 


A /TTf omnrM-niTT-o /l /900Q 

A/ u isunomiya / 1 / zuuy 


11^ 
110 


A PTT1 7'^SQ 

u 1 1 ooy 




R /1 7 /9nnQ 
0/ 1 < / zuuy 


A l^c^wj V^nrlr /41 Q7 /900Q 
A/iNew 1 OI K/ 4iy / / zuuy 


1 1 
110 


A PT7Q1 

1 / yiOO 


TTQ A 


/I s /9nnQ 
0/ 15/ zuuy 


A /T^of Inoo/^ o /QPC^OQ /900Q 

A/ oeinesQa/ or ouo / zuuy 


117 
11/ 


AP^zL^m 7 


PViins^ 

v^nma 


R /I ft /9nnQ 
0/ lo/ zuuy 


A /Naniincy/I /900Q 

i\/ IN anj ing / J- / zuuy 


1 1 R 
llo 


A PTT'^m 1 9 
r\.\j U OUl IZ 


TTQ A 


/I o /9nnQ 
0/ lo/ zuuy 


A /QiK^or Qr^rincf /QPI^OQ /900Q 

A/onver opring/ or ouy/zuuy 


119    ACU30121          USA        6/18/2009         A/Silver Spring/SP510/2009
120    ACT22055          Chile      6/19/2009         A/Puerto Montt/Bio87/2009
121    ACU30073          Colombia   6/25/2009         A/Bogota/0466N/2009
122    ACU13129          China      6/29/2009         A/Changsha/78/2009
123    ACT79133          Japan      6/29/2009         A/Japan/1070/2009
124    ACT66162          Singapore  6/30/2009         A/Singapore/TLL01/2009
125    ACT83739          Italy      7/1/2009          A/Ancona/04/2009
126    ACT82516          Brazil     7/3/2009          A/Sao Paulo/43812/2009
127    ACT83741          Italy      7/12/2009         A/Ancona/05/2009
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Table 4. The H3N2 sequences in Figure 4(c). Data are from NCBI Influenza Virus Resources. 



Label 


Accession number 


Country 


Collection date 

/ /II/ \ 
(mm/dd/yyyy) 


Virus name 


1 


ACF36384 


Hong Kong 


10/1/2002 


A/Hong Kong/CUHK50200/2002 


2 


ACC66393 


A j_ 1 • 

Australia 


10/2/2002 


A /VICTORIA /432 /2002 


3 


ABI92632 


A J. 1 • 

Australia 


10/15/2002 


A/ Western Austraha/36/2002 


4 


ACC67637 


France 


-1 r\ /-I f 1 c\r\r\c\ 

10/15/2002 


A/LYON/19989/2002 


5 


A007793z 


TTC A 
UbA 


10/30/2002 


A / Hawaii / Hl-Uz-JToo / zUUz 


6 


ACC66357 


South Korea 


10/31/2002 


A/Kwangju/219/2002 


7 


ACC77933 


USA 


11/1/2002 


A/New York/28/2002 


8 


ACC77934 


Bulgaria 


11/2/2002 


A/Sofia/343/2002 


9 


ACC67720 


Bulgaria 


11/2/2002 


A/SOFIA/344/2002 


10 


ACC66354 


South Korea 


11/14/2002 


A/Incheon/260/2002 


11 


ACC66358 


South Korea 


11/18/2002 


A/KYONGBUK/304/2002 


12 


ACC66369 


South Korea 


11/20/2002 


A/Pusan/504/2002 


13 


ACC66345 


South Korea 


11/20/2002 


A/Cheonnam/323/2002 


14 


ACC77936 


China 


11/21/2002 


A/Beijing/178-NEW-ACN/2002 


15 


ACC77937 


Philippines 


11/21/2002 


A/Philippines/PH-1159050/2002 


16 


ACC66344 


South Korea 


11/29/2002 


A/Cheju/274/2002 


17 


ACC77939 


China 


12/1/2002 


A/Anhui/550/2002 


18 


ACC67682 


France 


12/1/2002 


A/PARIS/207/2002 


19 


ACC77940 


USA 


12/2/2002 


A/Alaska/AK-RSP-02-753/2002 


20 


ACN32523 


Italy 


12/2/2002 


A/Genoa/1/2002 


21 


AB037625 


South Korea 


12/2/2002 


A/Korea/770/2002 


22 


ACC77942 


USA 


12/3/2002 


A/California/CA-T02-3025/2002 


23 


ACC66374 


Singapore 


12/7/2002 


A/SINGAPORE/44/2002 


24 


ACC77944 


Taiwan 


12/9/2002 


A/Taiwan/TW-1521/2002 


25 


ACC77945 


Turkey 


12/13/2002 


A/Turkey/TU-RSP-02-1035/2002 


26 


ACC77946 


Taiwan 


12/15/2002 


A/Taiwan/TW-1522/2002 


27 


ACF36415 


Hong Kong 


12/18/2002 


A/Hong Kong/CUHK53123/2002 


28 


ACC77948 


China 


12/20/2002 


A/Beijing/301/2002 


29 


ACC77949 


USA 


12/22/2002 


A/Hawaii/HI-02-4051/2002 


30 


ACC67657 


Latvia 


12/23/2002 


A/LATVIA/13794/2002 


31 


ACC77950 


USA 


12/24/2002 


A/Hawaii/HI-02-4055/2002 


32 


ABO10182 


Japan 


12/25/2002 


A/Kumamoto/102/2002E 


33 


ACC77951 


USA 


12/28/2002 


A/Hawaii/HI-02-4092/2002 


34 


ACC77952 


USA 


12/29/2002 


A/North Carolina/NC-C02-4981/2002 


35 


ACC66516 


Singapore 


1/7/2003 


A/SINGAPORE/7/2003 


36 


ACC66445 


United Kingdom 


1/8/2003 


A/ENGLAND/1/2003 


37 


ACC67602 


USA 


1/8/2003 


A/Washington/WA-V130811/2003 


38 


ACC67661 


Germany 


1/14/2003 


A/BAYERN/1/2003 


39 


ACC67377 


Egypt 


1/14/2003 


A/Egypt/EG-2002923226/2003 


40 


ACC67440 


USA 


1/15/2003 


A /Michigan /ML VC76 /2003 


41 


ACC66517 


Singapore 


1/15/2003 


A/SINGAPORE/16/2003 


42 


ABO20960 


Australia 


1/15/2003 


A/Sydney/015/03 


43 


ACC66485 


USA 


1/16/2003 


A/MEMPHIS/1/2003 


44 


ACC66486 


USA 


1/16/2003 


A/Memphis/2/2003 
45 


ACF36424 


Hong Kong 


1/18/2003 


A/Hong Kong/CUHK5627/2003 


46 


ACF36426 


Hong Kong 


1/20/2003 


A/Hong Kong/CUHK5723/2003 


47 


ABB03112 


USA 


1/20/2003 


A/New York/485/2003 


48 


ACC67688 


Ireland 


1/24/2003 


A/IRELAND/1215/2003 


49 


ACC66560 


USA 


1/28/2003 


A/WYOMING/2/2003 


50 


ACC66521 


Bulgaria 


2/1/2003 


A/SOFIA/141/2003 
Table 5. The H3N2 sequences in Figure 4(d). Data are from NCBI Influenza Virus Resources. 



Label 


Accession number 


Country 


Collection date 
(mm/dd/yyyy) 


Virus name 


1 


ABI92544 


Australia 


10/2/2001 


A/Western Australia/17/2001 


2 


ACC66320 


Australia 


10/29/2001 


A/South Australia/102/2001 


3 


ACF36383 


Hong Kong 


11/8/2001 


A/Hong Kong/CUHK50080/2001 


4 


AAX56570 


USA 


11/22/2001 


A/New York/71/2001 


5 


AAX56490 


USA 


11/27/2001 


A/New York/124/2001 


6 


AAX11625 


USA 


12/15/2001 


A/New York/83/2001 


7 


AAX35861 


USA 


12/23/2001 


A/New York/131/2001 


8 


AAY28335 


USA 


12/25/2001 


A/New York/94/2001 


9 


ACF36396 


Hong Kong 


12/26/2001 


A/Hong Kong/CUHK51424/2001 


10 


ACF36397 


Hong Kong 


12/26/2001 


A/Hong Kong/CUHK51431/2001 


11 


ACF36398 


Hong Kong 


12/28/2001 


A/Hong Kong/CUHK51490/2001 


12 


AAX12761 


USA 


12/29/2001 


A/New York/84/2001 


13 


AAa5d44(J 


USA 


12/31/2001 


A/New York/oo/zUOl 


14 


ACC77886 


China 


1/1/2002 


A/Tianjin/5/2002 


15 


ACC77887 


USA 


1/1/2002 


A/Louisiana/2/2002 


16 


ACC77888 


USA 




A/Oklahoma/2/2002 


17 


ABA42335 


USA 


1/4/2002 


A/New York/403/2002 


18 


ACC77889 


USA 


1/5/2002 


A/Utah/6/2002 


19 


ACC77891 


China 


1/7/2002 


A/Beijing/20/2002 


20 


ABA42989 


USA 


1/7/2002 


A/New York/405/2002 


21 


ACC66370 


Singapore 


1/7/2002 


A/SINGAPORE/2/2002 


22 


ACC66386 


Australia 


1/9/2002 


A/VICTORIA/103/2002 


23 


ACC66387 


Australia 


1/9/2002 


A/VICTORIA/105/2002 


24 


ACF36405 


Hong Kong 


1/10/2002 


A/Hong Kong/CUHK5250/2002 


25 


ACF36406 


Hong Kong 


1/10/2002 


A/Hong Kong/CUHK5251/2002 


26 


ACC77895 


China 


1/10/2002 


A/Wuhan/12/2002 


27 


ABA42368 


USA 


1/11/2002 


A/New York/408/2002 


28 


ACC66544 


Australia 


1/12/2002 


A/VICTORIA/102/2003 


29 


ACF36411 


Hong Kong 


1/12/2002 


A/Hong Kong/CUHK5296/2002 


30 


AAX56460 


USA 


1/12/2002 


A/New York/89/2002 


31 


ABR15962 


New Zealand 


1/14/2002 


A/Auckland/614/2002 


32 


ABA42379 


USA 


1/14/2002 


A/New York/409/2002 


33 


AAY47023 


USA 


1/14/2002 


A/New York/125/2002 


34 


ABA42487 


USA 


1/16/2002 


A/New York/418/2002 


35 


AAY28648 


USA 


1/16/2002 


A/New York/107/2002 


36 


ACC77896 


China 


1/17/2002 


A/Wuhan/16/2002 


37 


ABR14636 


Norway 


1/17/2002 


A/Oslo/398/2002 


38 


AAX12771 


USA 


1/17/2002 


A/New York/91/2002 


39 


AAX47525 


USA 


1/18/2002 


A/New York/96/2002 


40 


ABA42498 


USA 


1/18/2002 


A/New York/419/2002 


41 


ACC77897 


Hong Kong 


1/19/2002 


A/Hong Kong/1550/2002 


42 


AAX56580 


USA 


1/20/2002 


A/New York/106/2002 


43 


AAX12791 


USA 


1/22/2002 


A/New York/100/2002 


44 


ABA42401 


USA 


1/22/2002 


A/New York/411/2002 
45 


ilV^V^ 1 1 OyO 


TTQ A 


1 /oq 79009 
1 / Zo / ZUUZ 


A 7l\ToTAr V/^T-V 7l ^ 79009 
A/lNeW lOiK/lO/ZUUZ 


40 


ALyLyOOo / 1 


Singapore 


1 /9/1 79009 
1 7 Z4/ ZUUZ 


A 7QTTSTP A POP TP 77 79009 
A/ DliN vjrAx^l^Xlll;/ / /ZUUZ 


A7 
4 / 


/\/\y^0001U 


TTQ A 


1 797 79009 

1/ z / 7 ZUUZ 


A 7l\Tc>AT5r Vr^T-V 7l "^9 79009 
A/iNeW I OiK/ lOZ/ ZUUZ 


4o 


/llJljU4yoO 


TTQ A 


1 /9Q /9009 
1 7 Zo/ ZUUZ 


A 7l\Ti2»TTr V/^T-V 7/1 90 79009 
A/iNew lOrK/4ZU/ZUUZ 


AO 
4y 


A AVI 1 
/\/\yvl lOoO 


TTQ A 

U oil 


1 /9Q /9009 

1 7 zy 7 ZUUZ 


A /AToTiT^ VriT-V / I 1 /9009 
A/iNeW 1 OiK/ 1 lU/ ZUUZ 


ou 


1 44041 


TTQ A 


1 /QO 79009 
1 7 oU/ ZUUZ 


A 7l\ToTT7 Vr-wT-V 7l 1 Q 79009 

A/iNew lorK/ iio/zuuz 


01 


i\lJVj4oUoZ 


New Zealand 


1 /'^^ 79009 

1/ ol/ ZUUZ 


A 7AA/^QiVQf 79 79009 

A/ vvaiKato/ z/ zuuz 


DZ 


i\i\yS.000yU 


TTQ A 
U oA 


1 /"^l 79009 
1/ol/ZUUZ 


A 7l\Ti2»TTr V/^T-V 7l OQ 79009 

A/iNew lorK/ lUo/zuuz 


Oo 


A A VI 9sni 
/\./\y^lZoUl 


TTQ A 
U DA 


1 7^^! 79009 
1/ ol/ ZyjyjZ 


A 7l\Tcnnr V/^T-V 711"^ 79009 
A/iNeW 1 OiK/ 1 10/ ZUUZ 


04 


A R A /1 9/1 1 9 
/Vlj/V4Z41Z 


TTQ A 

U oA 


9 7l 79009 
Z / A- / ZUUZ 


A 7l\TiQiT7 Wm-V 7/119 79009 

A/i\ew iorK/4iz/zuuz 


00 


AljlZlOUo 


TTQ A 
U OA 


9 7l 79009 
Z/ 1/ Z\j\JZ 


A 7l\ToTAr V^rk-rV 7P/1 /9009 
A/i>eW 1 OiK/ V^4/ ZUUZ 


00 


A PP77Qni 


TTQ A 
U oA 


9 /I /9009 
Z/ 1/ ZxjxjZ 


A 7A/roooo/^V>no/2if f o /9 /9009 

A / iviassacnusexxs / z/ zuuz 


/ 


A PP77Qn9 


TTQ A 

U oA 


9 7/1 79009 
Z 1 ^1 ZUUZ 


A 7TvToAnr V/^-rV 718 79009 
A/iNeW 1 OiK/ lo/ ZUUZ 


Oo 


A A V9»ni A 
I ZoU14 


TTQ A 

U oA 


9 7/1 /9009 
Z 1 ^1 ZUUZ 


A 7TsT/2nT7 VrM-l^/IOI /9009 

A/iNew lorK/ lui/ ZUUZ 




A A Vi^<^/1 70 
/\./\.y^004 / U 


TTQ A 
U DA 


9 7<^ 79009 


A 7l\Tcnnr V/^T-V 7Q0 79009 

A/iNew 1 OiK/ yu/ ZUUZ 


ou 


A Rr^/I QHQQ 
/lljVor4oUyo 


New Zealand 


9 7l O /9009 
Z / lU/ ZKjKjZ 


A 7"\A7ollinrff /^^ /9009 

A / vvemngton / o/ zuuz 


01 


1 440ZU 


TTQ A 
U DA 


9 719 79009 
Z 1 LZj ZUUZ 


A 7l\ToTAr V/^-rV 771^ 79009 

A/iNew iorK//0/zuuz 


oz 


A A V/17'^1 ^ 
i\i\yS.4 / 010 


TTQ A 
U DA 


9 719 79009 
Z/ IZ/ ZUUZ 


A 7lVoTi7 V/-»T-V 7l 9Q 79009 

A/iNew 1 OIK/ izy/ ZUUZ 


Oo 


A A V^i^S71 


TTQ A 
U DA 


9/1/1 /9009 
Z / 14/ ZUUZ 


A 7l\T/2>ATr Vr^T-V 7l "^1^ 79009 
A/iNeW 1 OiK/ 100/ ZUUZ 


04 




Australia 


9 /I Q /9009 
Z / lo/ ZUUZ 


A /PPTQP A 1\TT? 7"^ 79009 
A/ oHlDoAlMlj/ 0/ ZUUZ 


00 


A A V^778/1 


TTQ A 
U DA 


9 /I Q 79009 

z/ ly/ ZUUZ 


A /ATo^^sr Vr-irV / I 90 /9009 
A/INGW I OIK/ IZU/ ZUUZ 


00 


A PP77Qn/l 


TTQ A 
U DA 


9 791 79009 

Z 1 ZVj ZyjyjZ 


A 7l\Tr^T«fV> r^oT-r^lin 7^^ 79009 

A/iNorin L^aronna/ 0/ ZUUZ 


o < 


Av^V^OOOOO 


Japan 


9 79^^ /9009 
Z / zo / ZUUZ 


A /FTTT^TTOT^ A /I ^ /9009 
A/ P U xV U WxVA / 10 / ZUUZ 


Oo 


ALyLy / / yuo 


TTQ A 
U DA 


9 79/1 79009 
Z / Z4/ ZUUZ 


A 7Tllinr^^c7l 79009 

A / ininois / 1 / ZUUZ 


oy 


R A P9S99i^ 
ij/vVorZoZZO 


Japan 


O /OK 79009 
Z / ZO / ZyjyjZ 


A /AAmrimh-Ci II.A 79009 

A / ivionoKa / 04 / zuuz 


70 
/ U 


A AV9SQ9C^ 
/V/V 1 ZooZO 


TTQ A 
U DA 


9 /9f^ /9009 
Z / ZO/ Z\J\JZ 


A 7l\T/2nT7- VrM-l^ /QQ /9009 

A/iNew lorK/oo/zuuz 


71 
/ 1 


A AVI 975^1 
AAyvlZ / ol 


TTQ A 
U DA 


9 /9<^ 79009 
Z / ZO / ZKjyjZ 


A 7l\Tcnnr Vr^yV /Q^ 79009 

A/iNew 1 OiK/ yo/ ZUUZ 


79 


A R A /1 9/1 7(\ 
AijA4Z4 i 


TTQ A 
U DA 


9 79Q 79009 
Z / Zo/ Z\j\jZ 


A /TsToiTr Wm-I^ 7/1 1 79009 

A/iNew lorK/ 410/ ZUUZ 




i\i\yvoOo41 


TTQ A 
U DA 


q /I 79009 
o / L 1 ZUUZ 


A 7TsTcnnr V/^-rV 7S<^ 79009 
A/iNeW 1 OiK/ 00/ ZUUZ 


7/1 
/ 4 




Hong Kong 


q /I /9009 
o/ 1/ Z\J\jZ 


A 7TTrMnrf T<"rMn rr 7 PT TTTT^ 1 *^9 1 <^ 79009 

A/nong ivong/ Uxiiviozio/ ZUUZ 


7P^ 
/ 


R A P9S99^^ 
I3/\VorZoZZO 


Japan 


q /I 79009 
o/ 1/ ZyjyjZ 


A 7l\/T/^T-i/^VQ 1^0 79009 

A / ivionoKa/ oz / zuuz 


7(K 
/ 


AAZjOOOoy 


TTQ A 
U DA 


q 7l 79009 
o/ 1/ Z\j\jZ 


A 7TsTi2>iTr Wm-V 71 9<^ 79009 

A/iNew lorK/ izo/zuuz 


77 


A PF'^<^999 
i\Lyr oDZZZ 


Hong Kong 


q /q 79009 


A 7TT/^nfr T^/-knfr 7 PTTTTT^I "^9/10 79009 
A/nong ivong/ Lv u riivioz4y/ ZUUZ 


78 
/ o 


A PTr*^^^99*^ 
ALyr ovZZo 


Hong Kong 


q 7/1 79009 
0/4/ ZUUZ 


A 7TT/-mrr T^r^nrr 7PTTTTT^1 *^97» 79009 

A/nong ivong/ u Urixvioz / 0/ ZUUZ 


7Q 


/11joOUU04 


New Zealand 


q 7/1 79009 
0/4/ Z\J\JZ 


A 7 A n/^Vlcsnrl 7<^1 1^ 79009 

A / AUCKianQ / 1 0/ ZUUZ 


oU 


A PP77QnQ 

ALvLv / / yuy 


Tin oil on/^ 

1 nauanQ 


q 7/1 /9009 
of ^/ ZyjyjZ 


A /Tlnoilovi/^ /I 9Q/100 /9009 

A/ 1 nanano / izo4uu / zuuz 


Ol 




TTQ A 


/'^ /9009 
O/ O/ ZUUZ 


A /TTflwflii /zL/9009 

1\ 7 ITLcLVVcLii 7 4: 7 ZiUUZ 


89 
oZ 


APP77Q1 1 
A-V^Lv / / y 11 


Hong Kong 


q 7<^ 79009 
0/0/ ZvvZ 


A /TTr»n(T T^rkrifr /I 91 71 1^ /9009 

A/nong ivong/ izi / 10/ ZUUZ 


83 


AAY98077 


USA 


3/7/2002 


A/New York/276/2002 


84 


ACC66376 


Thailand 


3/8/2002 


A/SONGKHLA/107/2002 


85 


AAX57734 


USA 


3/11/2002 


A/New York/130/2002 


86 


AAY28405 


USA 


3/21/2002 


A/New York/122/2002 


87 


ACC77912 


Hong Kong 


3/25/2002 


A/Hong Kong/568/2002 


88 


ACC66323 


Thailand 


3/25/2002 


A/BANGKOK/109/2002 


89 


ACC77913 


USA 


4/3/2002 


A/Washington/3/2002 


90 


ACC77914 


Brazil 


4/4/2002 


A/Ceara/177/2002 


91 


ACF36278 


Hong Kong 


4/4/2002 


A/Hong Kong/CUHK21713/2002 
92 


A PP^^'^^I 
/VL^L^DDoDi 


New Caledonia 


A /A /9009 
4/4/ ZyjyJZ 


A /IVoTAr Palorl/^niQ /^ /9009 

iA/i>ew L^aieQonia/ 0/ ZUUZ 


yo 


A PTr*^^^9»1 
iAL^r oOZoi 


Hong Kong 


A /» /9009 
4/0/ ZUUZ 


A /TT/-m rr T^/-kTi rr 7PTTTTT^91 7/19 79009 
iA/xlOng ivOng/LyUxiivZi /4Z/ZUUZ 


y^ 


A nV^f\9^A 
i\y r ouZo4 


Hong Kong 


4/11 /9009 
4/ i i/ ZUUZ 


A /J^n-ncr T<^ r»n cr /PT TPT1^91 Q ^^7 /9009 

/\/nong ivoiig/ u riivziyo < / zuuz 


Qi^ 

yo 


A ^^7/1 1 7 


New Zealand 


/I /I 9 /9009 
4/ iZ/ ZUUZ 


A /AA/^oiVo^-/^ /i^ /9009 

/!/ vvaiKato/ 0/ ZUUZ 


yo 


APP7701 
/Vv^Ly 1 / y iO 


TTQ A 

U oil 


A /I 7 /9009 
4/ i / / Z\J\JZ 


A /AToKt-cjgVq /I 9 /9009 

/A/ IN eorasKa/ iz/ zuuz 


y / 


A Rr^/1 9Q71 


New Zealand 


A /9009 
4/ ZD/ ZyjyJZ 


A /Ponf OT-Knr-TT- /01 /9009 

A/ LvanierDury / ui / zuuz 


yo 




Montr / /2»o 1 o n 

iNew zjeaianQ 


^ /9 /9009 

/ Zi / ZUUZ 


A /Panfp^rKnrA^ /I ZL /9009 

-Ti- / v^aniei ULiiy / 14/ zuuz 


00 

yy 


A PTr*^^^900 

iAi^r oDzyy 


Hong Kong 


1^ /I /9009 
0/ iU/ ZUUZ 


A /W/-»Tirr T^r^n rr /PTTTTT^9*^1 1 /9009 

A/riong ivong/ UxiivZoi lo/ zuuz 


1 on 


APF'^R'^OO 


Hong Kong 


^ /I /9009 

U/ lU/ ZUUZ 


A /PTnncr Knn cr /PT Tm^9'^1 R9 /9009 
iT./ nuiig, ivuiig,/ v_y u rixvzoioz/ zuuz 


1 ni 
iUi 




Hong Kong 


1^ /I 9 /9009 
/ iZ/ ZyjyjZ 


A /W/-^nrr r^n rr /PT TWT^9q 1 QO /9009 

A/riong ivong/ UxIiyZoIoU/ zuuz 




APP7701 
( / y iO 


Argentina 


^ A /9009 
/ i4/ ZyjyjZ 


A / A T'fri2kn+in Q /l /9009 

i\/ /Argentina/ 1/ zuuz 


iUo 


A PP7701 7 


Peru 


^ lA A /9009 
0/ i4/ ZUUZ 


A /Pi2»T'n /Q090 /9009 

A / i^eru / ouzy / zuuz 


1 OA 


A PP7701 8 
ilLyvy / / yio 


China 


r /I /I 79009 
/ 14/ Z\j\jZ 


A /PnoTirf^Vi/^n /"^Q/l /9009 

i\ 1 'oruangznou / oy4 / zuuz 


iUO 




Australia 


^ /90 /9009 
0/ ZU/ ZUUZ 


A /A/TPTOPTA /f^Of^/9009 
A/ V iv^ 1 i^xli A / OUO 7 ZUUZ 


iUD 


A PP7701 


Brazil 


t\ /no 79009 
/ Zo / Z\J\JZ 


A /"RT-Qryil /I 797 /9009 

A/ jDrazn/ l i z i / zuuz 


1 07 
iU / 




Hong Kong 


f< /I /9009 
0/1/ ZDDZ 


A /TT/^nrr T<^/^n rr /PTTTTT<^9/1 0/1 /I /9009 

A/rlong ivong/ UrlivZ4U44/ ZUUZ 


iUO 




Hong Kong 


f\ /9 /9009 
yj / z / ZUUZ 


A /TTrkTicr T^r»ncr /PTTm^9A0^A /9009 

A/nong ivong/ u X11VZ4U04/ zuuz 


1 00 

luy 




Hong Kong 


^ 70 79009 
D/o/ ZUUZ 


A /TT/-»n rr T^r^n rr /PT TTTTr9/l 11/1 /9009 

A/JlOng iV0ng/LyUxiivZ4ii4/ZUUZ 


110 
iiU 




Hong Kong 


<^ 7/1 79009 
0/4/ ZyjyjZ 


A lT^r\Tncr T^/^n rr /PT TTn^9/l 1 ^^7 /9009 

A/nong xvong/ u rixVZ4io / j zuuz 


111 
iii 


A PP77Q90 

ALvLv / / yzu 


Hong Kong 


f< lf\ /9009 
0/0/ ZkJDZ 


A /TT/-wnrr T<^,-Kn rr / 1 1 /I Q /9009 

A/ riong ivong/ ii4o/ zuuz 


119 


A RTQ9'^'^'^ 


New Zealand 


f{ /7 /9009 
0/ / / ZUUZ 


A /T^iTnorlin /l /9009 

A / uuneom / i / zuuz 


1 1 Q 
iio 


A PP<^<^Q9Q 


Australia 


f< 77 79009 
0/ / /ZUUZ 


A /RPTQR A 1\TTh^ /i^ /9009 
A / JDilioJD AlN i-^/ 0/ ZUUZ 


1 1 A 


A RPi^09f^i^ 
iAij V^OUZOO 


New Zealand 


<^ 7q 79009 
0/0/ ZUUZ 


A 7P Q nfoT-Ki ivAf /A7 /9009 

A/ v^anteroury/ 4 / / zuuz 


111^ 


iAij xlZooOO 


New Zealand 


a 70 79009 
0/0/ ZUUZ 


A / A n/^VloTi/^ /^^Oft /9009 

A / AucKiana / ouo / zuuz 


1 1 <^ 

iio 


A RPi^0900 

/AoL^ouzyy 


New Zealand 


<^ 7l 79009 
0/ lU/ ZUUZ 


A /Panf OT-KiiT-Tr /l^"^ /9009 

A/ LvanterDury/ 00/ zuuz 


117 
ii / 


A RPi^09Q8 


New Zealand 


/I 1 /9009 
0/ 11/ ZUUZ 


A /Pon + /2»vKnT-Tr /l^O /9009 

A / L^anter Dury / ou / zuuz 


115^ 
iio 


A PPAAQA7 
ilv^v^00O4 / 


New Zealand 


71 q 79009 
0/ lo/ ZUUZ 


A /PTTPTQTPTTTTPPTT /*^7 /9009 
A / Lyililio 1 KytL U ilL^ xl / / / ZUUZ 


110 

iiy 




New Zealand 


7l Q 79009 
0/ 10/ ZUUZ 


A /QrMifln Ponf oT-KiiT-TT- /Q7 /9009 

A/ooutn LvanterDury/ / /zuuz 


1 90 
iZU 


iAiJVjO / 4Zo 


New Zealand 


<^ /I 7 79009 
K) 1 L 1 1 ZUUZ 


A /Wc»l 1 in rrf/^n /"^S /9009 

A / vvenington / 00 / zuuz 


1 91 
iZi 


A RP^^70»0 

iAijLyD / yoy 


New Zealand 


f\ /90 /9009 
0/ ZU/ ZUUZ 


A /Ponf /^kT-KiiT-tr /QQ /9009 

A / L^anter Dury / 00 / zuuz 


1 99 


iT- u u o z y 


All C!'i"T*0 1 1 O 

iT-Libii ana 


R /90 /9009 

/ ZU / ZUUZ 


A /RPmR A NP /R /9009 

iT. / J_)IX10J_)iT.i > S2j / U/ ZUUZ 


1 9Q 
iZo 


A RP QQQO^^ 


Australia 


(\ /OK /9009 
0/ ZD/ ZUUZ 


A /AAToof OT-ri A no+T-oli /9Q /9009 

A/ vvestern Ausxrana/zo/zuuz 


1 9Zl 

iZ4: 


A RP^7^^i^'^ 
iAiDLvD / DOo 


New Zealand 


f\ 797 79009 
0/ Z 1 / ZUUZ 


A /Panf OT-KiiT-Af /i^7 /9009 

A/ L^anierDury/ / / zuuz 


1 91^ 
iZO 


A PP77099 

iAL^L^ / / yzz 


Brazil 


fi 797 79009 
0/ Z / / ZUUZ 


A /Rvoryil /17Q9 /9009 

A / ijrazn / 1 / oz / zuuz 


1 9A 
iZO 


A RPSi^0i^9 

/AiDLvooyoz 


New Zealand 


f\ 7qO 79009 
0/ OU/ ZUUZ 


A /Panf OT-Km-TT- /<^0 /9009 

A/ LvanterDury/ ou/ zuuz 


1 97 
iZ ( 


A RP Q7/1 i^O 
Aij^o / 40U 


New Zealand 


7 /I /9009 
/ / 1/ ZUUZ 


A /AAT'oil^of /91 /9009 

A/ vvaiKato/ zi / zuuz 


1 98 

izo 


A PPAA'^9/1 


Tl-ioilonrl 

1 naiianQ 


7/1 /9009 
/ / 1/ ZUUZ 


A /R A ATPl^m^ /I QO /9009 
A / 13 AIN ^IVV^IV / lyu/ zuuz 


1 90 

izy 




Hong Kong 


7 /I /9009 
/ / 1/ ZUUZ 


A /TTr»n(T T^rknfr /PTTTn^'^'^0A7 /9009 

A/nong ivong/ u nivoou4 / /zuuz 


130 


ABC50321 


New Zealand 


7/2/2002 


A/Canterbury/59/2002 


131 


ACF36340 


Hong Kong 


7/2/2002 


A/Hong Kong/CUHK33079/2002 


132 


ACF36341 


Hong Kong 


7/2/2002 


A/Hong Kong/CUHK33106/2002 


133 


ACC66373 


Singapore 


7/2/2002 


A/SINGAPORE/29/2002 


134 


ABI92346 


New Zealand 


7/3/2002 


A/Waikato/23/2002 


135 


ACC66381 


Australia 


7/4/2002 


A/SYDNEY/23/2002 


136 


ABC68093 


New Zealand 


7/6/2002 


A/Canterbury/72/2002 


137 


ABC85919 


New Zealand 


7/6/2002 


A/Canterbury/70/2002 


138 


ACC66380 


Australia 


7/7/2002 


A/SYDNEY/21/2002 
139 


i\ljVorZOyOO 


New Zealand 


7 /Q 79009 

/ /y/ zuuz 


A /Finnoriin /I /9009 

i\/ uuneQin/ lu/ zuuz 


1 /in 

i4U 




New Zealand 


7 /Q /9009 

/ /y/zuuz 


A /ATTPT^T A IVFi /9<^ /9009 
iA/ iA U LyxVljiAiN U / ZO / ZUUZ 


1/11 


A RP<^7i^/l 
/loLvO / 04o 


New Zealand 


7 /Q /9009 

/ / y/ zuuz 


A /Pan-f-taT-VMiT-Ar /7Q /9009 

/A/ LvanterDury/ / y/zuuz 


1/19 




New Zealand 


7 /Q /9009 

/ /y/zuuz 


A /PoT1^-i2»T-KnT'T7- /Q1 /9009 

iA/ v^anterDury / oi / zuuz 


1 A*^ 


/Aijiyzooo 


Australia 


7/10 /9009 
/ / lU/ ZUUZ 


A /AA/oofoT-n A nof T-olio /9i^ /9009 

iA/ vvesiern /vusirana/zo/zuuz 


1/1/1 
144 


Ar)U0Uo04 


New Zealand 


7/11 /9009 
/ / 11/ ZUUZ 


A /Ponf OT-Knr-TT- /7l^ /9009 

A/ LvanierDury/ /O/zuuz 


140 


iAovyOUoOO 


New Zealand 


7/11 /9009 
//II/ ZUUZ 


A /Panf ovKnvAf /so /9009 

/A/ L^anteroury/ ou/ zuuz 


1 /l^^ 
140 


A RP/1»1 *^7 
iAlj^4olO / 


New Zealand 


7 /I IT /9009 
/ / 10/ ZUUZ 


A /AA^oU^Tirrf /-in /71 /9009 

A/ vveningion/ / i/zuuz 


1 A7 

14 / 


A PPRfi'^'^n 
/\^^oooou 


/wibti ana 


7 /I ^ /9009 
/ / 10/ ZUUZ 


A /RPTCIR A MP /99 /9009 

i\. / JDrLlkJ-D-Ti-iN H// ZZ / zuuz 


1 /I Q 
14o 


iALyLyOOool 


Australia 


7/17 /9009 
1 / i- ( / Z\J\JZ 


A /RPTQR A 1\TT7 //I /9009 
A / oHIo JD AlN H/ / 40 / ZUUZ 


1 AQ 
14;^ 


APP77Q9*^ 
iw^v^ / / yzo 


I--' 1 1 1 TATA! l^c^ 

± nnippines 


7/17 /9009 

1/11/ 


A /PViiliT^T^infiG /I ^098"^ /9009 

A / r^nnippines / louzoo / zuuz 


1 i^n 
lou 


A PP<^<^Q<^7 
ALvLvOOoO / 


Philippines 


7 /I Q /9009 
/ / 15/ ZUUZ 


A /PTTTT TPPT1\TT7Q //1 71 /9009 
A / r niljlr r llN H/o / 4 / 1/ ZUUZ 


1 f^i 

101 


iAijxvoyyoi 


Australia 


7/91 /9009 
i 1 AVj ZUUZ 


A /AA/oQfoT>n A ncf t-qIi Q /97 /9009 

A/ western Austrana/ z / / zuuz 


1 1^9 
lOZ 


A RPftft^^*^n 

illjicrOOOoU 


New Zealand 


7 /9Q /9009 
/ / Zo/ ZUUZ 


A /AAT^oil^of /-w /Q/^ /9009 

A/ vvaiKaio/ oO/ zuuz 


lOo 


A PP77Q9^^ 

/ALvLv 1 / yzo 


Hong Kong 


7 /9<^ /9009 
/ / ZO/ ZUUZ 


A /TT/^nrf /I 1^1 /9009 

A/ nong ivong/ lolu/ zuuz 


1 

104 


A T^TQ9i^77 

Aoiyzo / / 


Australia 


Q //^ /9009 
o/ 0/ ZUUZ 


A /AAToof OT-n A nof T-oli o /9Q /9009 

A/ vvesiern Ausxrana/zo/zuuz 


100 


iAL^LyOOooO 


Australia 


o /o /9n09 
O/o/ ZUUZ 


A /PPTQP A TMT? /I /l/l /9009 
A/ ortloijAlNIl// 144/ ZUUZ 


100 


A PP77Q9ft 
iALy^y / / yzo 


Brazil 


o /Q /9n09 

o/ y/zuuz 


A /PvQrvil /9/1 i^ft /9009 

A / orazn / Z40o / zuuz 


1 ^7 
10 / 


A RTTm ni A 
/\ljrlUlU14 


China 


o /I 1 /9009 

o/ 11/ zuuz 


A /"FTniiQn //1 1 1 /9009 

A/ rujian/411 / zuuz 


1 i^Q 
lOo 


ALvLvOOooO 


Australia 


Q /I 1 /9009 
o/ 11/ zuuz 


A /PPTQP A 1\TF* /I f^7 /9009 
A/ oHlooAiMi// 10 / / ZUUZ 


1 

loy 




China 


Q /I 1 /9009 
O/ 1 1/ ZUUZ 


A /FniiQn /A1 1 /9009 

A/ r ujiaii / 41 1 / ZUUZ 


lou 


A RPi^l Q71 
Aor Oiy / 1 


China 


Q /I 1 /9009 
5/ 11/ ZUUZ 


A /T^iiT^on //111 /9009 

A/rujian/411 /zuuz 


1 ^^1 

101 


iAij VjOO041 


New Zealand 


0/19 /9n09 

0/ iz/ zuuz 


A /Finnoriin /l 8 /9009 

A/ uuneQin/ lo/ zuuz 


1 ^^9 
lOZ 


A RT^Sni 7Q 
/AoiVoUl / y 


Australia 


0/10 /9n09 
0/ lo/ zuuz 


A /PinooTiclonrI /9*^ /9009 

A / Vc^ueensiana / zo / zuuz 


lOo 




Australia 


0/10 /9n09 

0/ 10/ zuuz 


A /PnciciTiclQTirl /97 /9009 

A/ ^ueensianQ/ z < / zuuz 


1 ^^/i 

104 




Australia 


Q /I Q /9009 
0/ lo/ zuuz 


A /PPTQP A 1\TT7 /I Q9 /9009 
A/ oHloljAiMi// lyZ/ZUUZ 


100 


A PP77Q9Q 

/vLyvy / / yzy 


India 


/I /9009 

0/ 10/ zuuz 


A /TnrliQ /9^^09 /9009 

A / inQia/ zoouz / zuuz 


100 


Aoiyzoyy 


Australia 


Q /I Q /9009 

0/ ly/ ZUUZ 


A /AA/oof OT-n A nof T-oli o /QO /9009 

A/ vvesiern Ausxrana/ou/zuuz 


1 ^^7 
lO / 


A RPA^I 70 


New Zealand 


Q /9n /9009 
/ zu / zuuz 


A /Wf^Uino-frin /7Q /9009 

A/ vvenington/ < y/zuuz 


lOo 


A PP^^^^*^^^/1 
iAl-yl-y00o04 


Australia 


S /91 /9009 

0/ zi/ zuuz 


A /PT^^PTTT //IQ /9009 
A/r H/Hl 11/ 4y/ ZUUZ 


1 <^Q 

loy 


/ALyLvOOoyi 


Australia 


/99 /9009 

/ zz / zuuz 


A /A/TPTPPTA /9'^^ /9009 
A/ V ILy 1 vJiv i A / ZoO / ZUUZ 


1 7n 
1 / u 


A RP "^7/1 79 


New Zealand 


Q /9Q /9009 
0/ Zo/ ZUUZ 


A /AAT'oil^of r>w /i^Q /9009 

A/ vvaiKaio/ oo/ ZUUZ 


1 71 


ARN^I 1 ^4 

XA.JJ1 N O J- J-Ort 


A 1 1 Q^"T*a 1 i ft 


8/28/2002 


A /Wpqfprn A imtr?ili?i / "^4/2002 


172 


ACC66322 


New Zealand 


9/3/2002 


A/AUCKLAND/57/2002 


173 


ACC66342 


Australia 


9/4/2002 


A/BRISBANE/312/2002 


174 


ABR15940 


New Zealand 


9/9/2002 


A/Auckland/611/2002 


175 


ACC66360 


Malaysia 


9/9/2002 


A/MALAYSIA/145/2002 


176 


ACC66379 


Australia 


9/15/2002 


A/South Australia/154/2002 


177 


ACC66392 


Australia 


9/16/2002 


A/VICTORIA/254/2002 


178 


ACC66383 


Taiwan 


9/25/2002 


A/TAIWAN/8/2002 


179 


ABI92621 


Australia 


9/29/2002 


A/Western Australia/35/2002 


180 


ACC66365 


Australia 


9/29/2002 


A/PERTH/89/2002 
Table 6. The sequences of H3N2 viruses after 10/01/2008. Data are from GISAID. The label in the first 
column is the same as that used in Figure 5. 



Label 


EPI ID 


Country 


Collection date 
(mm/dd/yyyy) 


Virus name 


1 


EPI187670 


Japan 


4/2/2009 


A/AICHI/158/2009 


2 


EPI187671 


Japan 


4/9/2009 


A/AICHI/161/2009 


3 


EF11d9724 


Japan 


11/26/2008 


A/AKITA/12/2008 


4 


EPI187672 


Japan 


5/27/2009 


A/AKITA/34/2009 


5 


EP118582D 


USA 


2/27/2009 


A/Arizona/08/2009 


6 


EPI185829 


USA 


2/25/2009 


A/Arizona/11/2009 


7 


EPI185805 


Russia 


3/23/2009 


A/Astrakhan/7/2009 


8 


EPI185723 


China 


2/10/2009 


A/Beijing-Xicheng/1108/2009 


9 


EP1189zlo 


Australia 


4/28/2009 


A/Brisbane/53/2009 


10 


EPI185704 


Canada 


3/15/2009 


A/British Columbia/RV1222/2009 


11 


EPI185707 


Canada 


3/15/2009 


A/British Columbia/RV1223/2009 


12 


EPI187312 


Japan 


5/1/2009 


A/CHIBA-C/42/2009 


13 


EPI187313 


Japan 


1/16/2009 


A/CHIBA-C/7/2009 


14 


EPI175111 


USA 


12/20/2008 


A/California/1/2008 


15 


EPI185831 


USA 


1/1/2009 


A/Colorado/01/2009 


16 


EPI169244 


USA 


10/15/2008 


A/Colorado/03/2008 


17 


EP117243o 


USA 


10/29/2008 


A/Colorado/04/2008 


18 


EPI185834 


USA 


1/16/2009 


A/Colorado/05/2009 


19 


EPI185837 


USA 


1/21/2009 


A/Colorado/06/2009 






TTQ A 

UoA 


iz/ i / /zUUo 


A/Coloraao/ iz/zUUo 


21 


EPI175120 


USA 


12/24/2008 


A/Colorado/15/2008 


22 


EPI187314 


Japan 


12/15/2008 


A/EHIME/36/2008 


23 


EPI187315 


Japan 


1/19/2009 


A/EHIME/9/2009 


24 


EPI187317 


Japan 


12/26/2008 


A/FUKUOKA-C/39/2008 


25 


EPI187319 


Japan 


12/26/2008 


A/FUKUSHIMA/124/2008 


26 


EPI185726 


China 


11/26/2008 


A/Fujian-Siming/1242/2008 


27 


EPI185729 


China 


3/3/2009 


A /T7* ™ /1 AC\ 1 C\C\C\C\ 

A / 1^ uj lan- 1 ongan /\.^2/ 2009 


28 


EPI187320 


Japan 


1/28/2009 


A/GIFU-C/37/2009 


29 


EPI187321 


Japan 


1/28/2009 


A/GIFU-C/38/2009 


30 


EPI172442 


Guam 


10/27/2008 


A/Guam/7124/2008 


31 


EPI187322 


Japan 


2/3/2009 


A/HIROSHIMA-C/20/2009 


oZ 


T?1DT1 Q7Q0Q 
rL/r iio / OZo 


Japan 


i / 0/ zuuy 


A/ JrLiJA.woJrLiiVlA-v^/ / /zuuy 


33 


EPI187674 


Japan 


5/26/2009 


A/HIROSHIMA/148/2009 


34 


EPI187675 


Japan 


6/10/2009 


A/HIROSHIMA/154/2009 


35 


EPI187324 


Japan 


12/6/2008 


A/HOKKAIDO/9/2008 


36 


EPI187325 


Japan 


11/29/2008 


A/HYOGO/6/2008 


37 


EPI187326 


Japan 


12/11/2008 


A/HYOGO/99/2008 


38 


EPI185840 


USA 


1/6/2009 


A/Hawaii/02/2009 


39 


EPI185842 


USA 


1/30/2009 


A/Hawaii/05/2009 


40 


EPI185845 


USA 


3/28/2009 


A/Hawaii/06/2009 


41 


EPI185851 


USA 


3/30/2009 


A/Hawaii/07/2009 


42 


EPI185848 


USA 


3/30/2009 


A/Hawaii/07/2009 


43 


EPI185854 


USA 


2/11/2009 


A/Hawaii/10/2009 
44 


H/i^iioOoO / 


TTQ A 


/I /I r /OfiOQ 

4/ iO/ zuuy 


A /TTQTArQii /I /I /900Q 

i\/ xiawan/ i4/ zuuy 


A ^ 
40 


H/x^iioOoOU 


TTQ A 


/I /I Q /900Q 

4/ iy/ zuuy 


A /TTqti7q^^ /I 1^ /900Q 

A./ xiawan/ io/ zuuy 


40 


H/i^iioOoOo 


TTQ A 


A /90 /900Q 

4/ zu / zuuy 


A /TTQATtrQii /I f\ /900Q 

i\ j riawan / io/ zuuy 


4 / 


T7PT1 79/1 QO 
H/i^ii / z4oU 


TTQ A 


1 1 /I O /900Q 
ii/ iU/ ZUUo 


A /TToTTToii //I7 /900ft 

/v / xiawan / ^( / zuuo 


A91 
4o 


ijjx^iioO / OZ 




9 /I /900Q 

z/ i/ zuuy 


A /TT -n rrii on rr AT q n rro n rr / 1 Q /I /900Q 

j\ / xienongj lang-iN angang / i o4/ zuuy 


AO 

4y 


rL/i^iioO / oO 


China 


9/10 /900Q 

z/ iu/ zuuy 


A /TToilr^nrfiionrf Vi o yi rr fo n rr / 1 /900Q 

/\. / nenongjiang-Aiangiang/ lOo / zuuy 


ou 


rppii QIX77Q 
rL/i^iioO / 1 y 


Honduras 


IX /IX /OfiOQ 

0/0/ zuuy 


A /TT/^nrlnvQC /l^<^ /900Q 

i\/ xionQuras/ 00/ zuuy 


Oi 


H/i^iioO / oZ 


Honduras 


0/ 0/ zuuy 


A /TTz-^Ti/^nT-oo /^qQ /900Q 

j\ 1 nonauras / ooy / zuuy 


oz 


rL/i^iioO / oo 


China 


900Q /oo /oo 

zuuy / uu / uu 


A /TT/^nrr T^/^r> rr /O"^ /900Q 

/A/xiong ivong/ uo/ zuuy 


Oo 


H/i^iioO / 4i 


China 


A l7 /900Q 

^/ i / zuuy 


A IWrsmre T<" r^n rr / 1 QQQ /900Q 

/v/nong ivong/ iyyy/ zuuy 


^A 
04 


TPPT1 9l^7AA 


China 


A l7 /900Q 

4/ / / zuuy 


A /TTr^nrr T^/^n rr /9000 /900Q 

A/xiong ivong/ zuuu/ zuuy 


00 


H/i^iioO < 4 / 


China 


A 1 (\ /900Q 

4/ 0/ zuuy 


A /TT/^nrr T<"/^ri rr /9007 /900Q 

A/nong ivong/ zuu / / zuuy 


00 


H/i^iioO / OU 


China 


/I /IX /OfiOQ 